DECLARATION OF VISHWANATH R. IYER, Ph.D.
UNDER 37 C.F.R. § 1.132



I, VISHWANATH R. IYER, Ph.D., declare and state as
follows:

1. I am an Assistant Professor in the Section of 
Molecular Genetics and Microbiology, Institute of Cellular and 
Molecular Biology, University of Texas at Austin, where my 
laboratory currently studies global transcriptional control in 
yeast, gene expression programs during human cell 
proliferation, and genome-wide transcription factor targets in 
yeast and human. Immediately prior to this position, I spent 
four years as a postdoctoral fellow in the laboratory of 
Patrick O. Brown at Stanford University studying the
transcriptional programs of yeast and of human cells. My 
curriculum vitae is attached hereto as Exhibit A. 

2. Beginning in Dr. Brown's laboratory, where I 
helped to develop the first whole genome arrays for yeast and 
early versions of highly representative cDNA arrays for human 
cells, and continuing to the present day, I have used 
microarray-based gene expression analysis as a principal 
approach in much of my research. 



3. Representative publications describing this 
work include: 



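DeRisi J. et al., "Exploring the metabolic and genetic
control of gene expression on a genomic scale,"
Science 278:680-686 (1997);

Marton M. J. et al., "Drug target validation and
identification of secondary drug target effects using DNA
microarrays," Nature Med. 4:1293-1301 (1998);

Iyer V. R. et al., "The transcriptional program in
the response of human fibroblasts to serum,"
Science 283:83-87 (1999); and

Ross D. T. et al., "Systematic variation in gene
expression patterns in human cancer cell lines,"
Nature Genetics 24:227-235 (2000).
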
Two of the papers describe our use of microarray-based
expression profiling to explore the metabolic reprogramming
that occurs during major environmental changes, both in yeast
(DeRisi et al., during the shift from fermentation to
respiration) and in human cells (Iyer et al., human
fibroblasts exposed to serum). One reference describes our
use of expression profile analysis in drug target validation
and identification of secondary drug effects (Marton et al.).
And one describes our use of expression profiling as a
molecular phenotyping tool to discriminate among human cancer
cells (Ross et al.).

4. Whether used to elucidate basic physiological
responses, to study primary and secondary drug effects, or to
discriminate and classify human cancers, expression profiling

The resolution of the patterns used in such
comparisons is determined by the number of genes detected: the
greater the number of genes detected, the higher the
resolution of the pattern. It goes without saying that higher
resolution patterns are generally more useful in such
comparisons than lower resolution patterns. With such higher
resolutions comes a correspondingly higher degree of
statistical confidence for distinguishing different patterns
as well as identifying similar ones. Patterns.

probe added to a microarray thus increases the number of genes
detectable by the device, increasing the resolving power of
the device. As I note above, higher resolution patterns are
generally more useful in comparisons than lower resolution
patterns. Accordingly, each new gene probe added to a
microarray increases the usefulness of the device in gene
expression profiling analyses. This proposition is so well-
established as to be virtually an axiom in the art, and has
been as long as I have been working in the field, and
certainly since the time I embarked on the production of whole
genome arrays in early 1996. Simply put, arrays with fewer
gene-specific probes are inferior to arrays with more gene-
specific probes.

8. For example, our ability to subdivide cancers
into discriminable classes by expression profiling is limited
by the resolution of the patterns produced. With more genes
contributing to the expression patterns, we can potentially
draw finer distinctions among the patterns, thus subdividing
otherwise indistinguishable cancers into a greater number of
classes; the greater the number of classes, the greater the
likelihood that the cancers classified together will respond
similarly to therapeutic intervention, permitting better
individualization of therapy and, we hope, better treatment
outcomes.

9. If a gene does not change expression in an
experiment, or if a gene is not expressed and produces no
signal in an experiment, that is not to say that the probe
lacks usefulness on the array; it only means that an
insufficient number of conditions have been sampled to
identify expression changes. In fact, an experiment showing
that a gene is not expressed or that its expression level does
not change can be equally informative. To provide maximum
versatility as a research tool, the microarray should
include - and as a biologist I would want my microarray to
include - each newly identified gene as a probe.

10. I declare further that all statements made
herein of my own knowledge are true and that all statements
made on information and belief are believed to be true, and
further that these statements were made with the knowledge
that willful false statements and the like so made are
punishable by fine or imprisonment, or both, under
Section 1001 of Title 18 of the United States Code and may
jeopardize the validity of any patent application in which
this declaration is filed or any patent that issues thereon.

Vishwanath R. Iyer

Assistant Professor 

Section of Molecular Genetics and Microbiology 

Institute of Cellular and Molecular Biology 

MBB3.212A, University of Texas at Austin 

Austin, TX 78712-0159 

Phone: 512-232-7833 

Fax: 512-232-3432 

Email: vishy@mail.utexas.edu 

Education/Training 

Bombay University, Mumbai, India             B.Sc. (1987), Chemistry & Biochemistry
M. S. University of Baroda, Baroda, India    M.Sc. (1989), Biotechnology
Harvard University, Cambridge MA             Ph.D. (1996), Genetics
Stanford University, Stanford CA             Post-doctoral (1996-2000), Genomics

Research Experience 

9/00-5/03 Assistant professor, Section of Molecular Genetics and
Microbiology, University of Texas, Austin TX 

■ Global transcriptional control in yeast 

■ Gene expression programs during human cell proliferation

■ Genome-wide transcription factor targets in yeast and human

■ Collaborative microarray facility 

5/96-8/00 Post-doctoral fellow Stanford University, Stanford CA 
(Advisor: Dr. Patrick O. Brown) 

■ Yeast whole-genome ORF and intergenic microarrays 

■ Human cDNA microarrays for expression profiling 

9/89-4/96 Graduate student Harvard University, Cambridge MA 

(Advisor: Dr. Kevin Struhl) 

■ Yeast transcriptional regulation 



Honours and Awards 

Government of India Biotechnology Fellowship (1987-1989) 
University Grants Commission Junior Research Fellowship (1989) 
Stanford University/NHGRI Genome Training Grant (1996) 

Invited Conference talks (selected) 

Invited Lecturer, NEC-Princeton Lectures in Biophysics 

Princeton, NJ (June 1998) 
Plenary Session Speaker, HGM '99 (HUGO Human Genome Meeting)

Brisbane, Australia (April 1999) 

Invited Speaker, Gordon Research Conference "Human Molecular Genetics" 
Newport, RI (August 2001) 



Invited Speaker, Nature Genetics "Oncogenomics 2002" Conference 
Dublin, Ireland (May 2002) 

Invited Speaker, Systems Biology: Genomic Approaches to Transcriptional
Regulation, Cold Spring Harbor Laboratory Meeting (March 2003)

Symposium co-Chair and Speaker, "Functional Genomics", American Society for
Biochemistry and Molecular Biology Meeting, San Diego, CA (April 2003)

Invited Speaker in Functional Genomics (Gene Networks) Symposium, International
Congress of Genetics, Melbourne, Australia (July 6-11, 2003)

Invited Speaker, "BioArrays Europe 2003"
Cambridge, UK (Sep/Oct 2003)

Departmental Seminars 

^Octob^ 2^2002* GenCtiCS BiochemistI ? & Departments, 

New York University School of Medicine, Department of Biochemistry
November 20 2002

UT Southwestern Medical Center, Human Genetics Seminar Series
May 5 2002

UCLA School of Medicine, Department of Human Genetics
June 2 2003

National Human Genome Research Institute
June 12 2003

Sanger Institute of the Wellcome Trust, Hinxton UK
Sep 2003

Other Professional Activities 

Reviewer for Genome Biology, Genome Research, Nature Genetics, Science (1998-2003)
Instructor, Cold Spring Harbor Summer Course "Making and Using DNA Microarrays"
Member, NIDDK Special Emphasis Review Panel ZDK1 (2001-2002)

Publications 

1. Iyer V. & Struhl K. (1995) Poly(dA:dT), a ubiquitous promoter element that
stimulates transcription via its intrinsic DNA structure. EMBO J. 14: 2570-2579.

2. Iyer V. & Struhl K. (1995) Mechanism of differential utilization of the his3 TR and TC
TATA elements. Mol. Cell Biol. 15: 7059-7066.

3. Iyer V. & Struhl K. (1996) Absolute mRNA levels and transcription initiation rates in
Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. (USA) 93: 5208-5212.



4. DeRisi J. L., Iyer V. R. & Brown P. O. (1997) Exploring the metabolic and genetic
control of gene expression on a genomic scale. Science 278: 680-686.

5. Marton M. J., DeRisi J. L., Bennett H. A., Iyer V. R., Meyer M. R., Roberts C. J.,
Stoughton R., Burchard J., Slade D., Dai H., Bassett D. E. Jr., Hartwell L. H., Brown
P. O. & Friend S. H. (1998) Drug target validation and identification of secondary
drug target effects using DNA microarrays. Nature Med. 4: 1293-1301.

6. Lutfiyya L. L., Iyer V. R., DeRisi J., DeVit M. J., Brown P. O. & Johnston M. (1998)
Characterization of three related glucose repressors and genes they regulate in
Saccharomyces cerevisiae. Genetics 150: 1377-1391.

7. Spellman P. T., Sherlock G., Zhang M. Q., Iyer V. R., Anders K., Eisen M. B., Brown P.
O., Botstein D. & Futcher B. (1998) Comprehensive identification of cell cycle-
regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization.
Mol. Biol. Cell 9: 3273-3297.

8. Iyer V. R., Eisen M. B., Ross D. T., Schuler G., Moore T., Lee J. C. F., Trent J. M.,
Staudt L. M., Hudson J. Jr., Boguski M. S., Lashkari D., Shalon D., Botstein D. &
Brown P. O. (1999) The transcriptional program in the response of human
fibroblasts to serum. Science 283: 83-87.

9 ' L " & Cl " 9) Gen ° miCS 3nd 3rray technol °gy- Curr. Opin. Oncol. 

10. Ross D. T., Scherf U., Eisen M. B., Perou C. M., Spellman P., Iyer V. R., Rees C.,
Jeffrey S. S., Van de Rijn M., Waltham M., Pergamenschikov A., Lee J. C. F.,
Lashkari D., Shalon D., Myers T. G., Weinstein J. N., Botstein D. & Brown P. O.
(2000) Systematic variation in gene expression patterns in human cancer cell lines.
Nature Genetics 24: 227-235.

11. Sudarsanam P., Iyer V. R., Brown P. O. & Winston F. (2000) Whole-genome
expression analysis of snf/swi mutants of Saccharomyces cerevisiae. Proc. Natl.
Acad. Sci. (USA) 97: 3364-3369.

12. Tran H. G., Steger D. J., Iyer V. R. & Johnson A. D. (2000) The chromo domain
protein Chd1p from budding yeast is an ATP-dependent chromatin-modifying factor.
EMBO J. 19: 2323-2331.

13. Gross C., Kelleher M., Iyer V. R., Brown P. O. & Winge D. R. (2000) Identification
of the copper regulon in Saccharomyces cerevisiae by DNA microarrays. J. Biol.
Chem. 275: 32310-32316.

14. Reid J. L., Iyer V. R., Brown P. O. & Struhl K. (2000) Coordinate regulation of yeast
ribosomal protein genes is associated with targeted recruitment of Esa1 histone
acetylase. Mol. Cell 6: 1297-1307.



15. Iyer V. R., Horak C. E., Scafe C. S., Botstein D., Snyder M. & Brown P. O. (2001)
Genomic binding sites of the yeast cell-cycle transcription factors SBF and MBF.
Nature 409: 533-538.

16. Miki R., Kadota K., Bono H., Mizuno Y., Tomaru Y., Carninci P., Itoh M., Shibata K.,
Kawai J., Konno H., Watanabe S., Sato K., Tokusumi Y., Kikuchi N., Ishii Y.,
Hamaguchi Y., Nishizuka I., Goto H., Nitanda H., Satomi S., Yoshiki A., Kusakabe
M., DeRisi J. L., Eisen M. B., Iyer V. R., Brown P. O., Muramatsu M., Shimada H.,
Okazaki Y. & Hayashizaki Y. (2001) Delineating developmental and metabolic
pathways in vivo by expression profiling using the RIKEN set of 18,816 full-length
enriched mouse cDNA arrays. Proc. Natl. Acad. Sci. (USA) 98: 2199-2204.

18. Iyer V. R. Microarray-based detection of DNA protein interactions: Chromatin
Immunoprecipitation on Microarrays, in DNA Microarrays: A Molecular Cloning
Manual, Bowtell D. & Sambrook J., eds., 453-463 (Cold Spring Harbor Laboratory
Press).*

*(not peer reviewed)

19. Killion P., Sherlock G. & Iyer V. R. (2003) The Longhorn Array Database: an
open-source implementation of the Stanford Microarray Database. BMC
Bioinformatics 4: 32.

20. Hahn J. S., Hu Z., Thiele D. J. & Iyer V. R. Genome-wide analysis of the biology of
stress responses through heat shock transcription factor.

21. Kim J. & Iyer V. R. The global role of TBP recruitment to promoters in mediating
gene expression profiles (manuscript in preparation).



Current/Pending Research Support 

U01 AA13518-01 Adron Harris (PI) 25% effort 

9/28/01 - 9/27/06 

NIH/NIAAA 

"INIA: Microarray Core" 

[illegible] response to Integrative Neuroscience Initiative on Alcoholism
[illegible]-01-002. The overall goal is to support the use of microarray technology
[illegible] expression [illegible] predict or accompany excessive alcohol consumption

Role: Co-investigator 



003658-0223-2001 Iyer (PI) 16% effort 
01/01/02 - 08/31/04 

Texas Higher Education Coordinating Board (ARP) 

"Microarray based global mapping of DNA-protein interactions at promoters in human [illegible]"
Pilot project to [illegible] interactions of transcription factors with human promoters
Role: PI 



Information Technology Research 0325116 R. Mooney (PI) 0% effort 

09/01/03-08/31/07 

NSF 

"[illegible] Multi-Source Data Mining to Experimentation for Gene Network Discovery"
Role: Co-investigator 



1 R01 CA95548-01A2 (pending) Iyer (PI) 25% effort 

12/1/03 - 11/30/08 

NIH 

"Analysis of genome-wide transcriptional control in yeast" 

Zte STS 1 ? reSPOnSiVe tranSCription toor ^ in trough 
Role: PI 



Breast Cancer Idea Award (pending) Iyer (PI) 10% effort 
1/1/04 - 12/31/06 

US Army Medical Research and Materiel Command 

"Genome-wide chromosomal targets of oncogenic transcription factors" 

This is a project aimed at identifying direct chromosomal targets of c-myc and ER in 

human cells through the use of a novel sequence tag analysis method 

003658-0531-2003 (pending) Marcotte (PI) 8% effort 
01/01/04 - 12/31/05 

Texas Higher Education Coordinating Board (ATP) 

"A novel high-throughput platform for measuring gene function on a genomic scale"
This proposal is aimed at developing a novel microarray based platform for automated
microscopic imaging of cells, allowing rapid and systematic evaluation
Exploring the Metabolic and Genetic Control of
Gene Expression on a Genomic Scale 

Joseph L. DeRisi, Vishwanath R. Iyer, Patrick O. Brown*

DNA microarrays containing virtually every gene of Saccharomyces cerevisiae were used 
to carry out a comprehensive investigation of the temporal program of gene expression 
accompanying the metabolic shift from fermentation to respiration. The expression
profiles observed for genes with known metabolic functions pointed to features of the
metabolic reprogramming that occur during the diauxic shift, and the expression patterns
of many previously uncharacterized genes provided clues to their possible functions. The
same DNA microarrays were also used to identify genes whose expression was affected 
by deletion of the transcriptional co-repressor TUP1 or overexpression of the transcrip- 
tional activator YAP1. These results demonstrate the feasibility and utility of this ap- 
proach to genomewide exploration of gene expression patterns. 



The complete sequences of nearly a dozen 
microbial genomes are known, and in the 
next several years we expect to know the
complete genome sequences of several 
metazoans, including the human genome. 
Defining the role of each gene in these 
genomes will be a formidable task, and un- 
derstanding how the genome functions as a 
whole in the complex natural history of a 
living organism presents an even greater 
challenge. 

Knowing when and where a gene is 
expressed often provides a strong clue as to 
its biological role. Conversely, the pattern 
of genes expressed in a cell can provide 
detailed information about its state. Al- 
though regulation of protein abundance in 
a cell is by no means accomplished solely 
by regulation of mRNA, virtually all dif- 
ferences in cell type or state are correlated 
with changes in the mRNA levels of many 
genes. This is fortuitous because the only 
specific reagent required to measure the 
abundance of the mRNA for a specific 
gene is a cDNA sequence. DNA microar- 
rays, consisting of thousands of individual 
gene sequences printed in a high-density 
array on a glass microscope slide (1, 2),
provide a practical and economical tool 
for studying gene expression on a very 
large scale (3-6). 

Saccharomyces cerevisiae is an especially
favorable organism in which to conduct a
systematic investigation of gene expression.
The genes are easy to recognize in the ge-
nome sequence, cis regulatory elements are
generally compact and close to the tran-
scription units, much is already known
about its genetic regulatory mechanisms,
and a powerful set of tools is available for its
analysis.

Department of Biochemistry, Stanford University School
of Medicine, Howard Hughes Medical Institute, Stanford,
CA 94305-5428, USA.

*To whom correspondence should be addressed. E-mail:
pbrown@cmgm.stanford.edu

A recurring cycle in the natural history 
of yeast involves a shift from anaerobic 
(fermentation) to aerobic (respiration) me- 
tabolism. Inoculation of yeast into a medi- 
um rich in sugar is followed by rapid growth 
fueled by fermentation, with the production 
of ethanol. When the fermentable sugar is 
exhausted, the yeast cells turn to ethanol as 
a carbon source for aerobic growth. This 
switch from anaerobic growth to aerobic 
respiration upon depletion of glucose, re- 
ferred to as the diauxic shift, is correlated 
with widespread changes in the expression 
of genes involved in fundamental cellular 
processes such as carbon metabolism, pro- 
tein synthesis, and carbohydrate storage 
(7). We used DNA microarrays to charac-
terize the changes in gene expression that
take place during this process for nearly the 
entire genome, and to investigate the ge- 
netic circuitry that regulates and executes 
this program. 

Yeast open reading frames (ORFs) were 
amplified by the polymerase chain reaction 
(PCR), with a commercially available set of 
primer pairs (8). DNA microarrays, con- 
taining approximately 6400 distinct DNA 
sequences, were printed onto glass slides by 



using a simple robotic printing device (9). 
Cells from an exponentially growing culture 
of yeast were inoculated into fresh medium 
and grown at 30°C for 21 hours. After an 
initial 9 hours of growth, samples were har- 
vested at seven successive 2-hour intervals, 
and mRNA was isolated (10). Fluorescently
labeled cDNA was prepared by reverse tran-
scription in the presence of Cy3(green)-
or Cy5(red)-labeled deoxyuridine triphos- 
phate (dUTP) (11) and then hybridized to 
the microarrays (12). To maximize the re- 
liability with which changes in expression 
levels could be discerned, we labeled cDNA 
prepared from cells at each successive time 
point with Cy5, then mixed it with a Cy3- 
labeled "reference" cDNA sample prepared 
from cells harvested at the first interval 
after inoculation. In this experimental de- 
sign, the relative fluorescence intensity 
measured for the Cy3 and Cy5 fluors at 
each array element provides a reliable mea- 
sure of the relative abundance of the corre- 
sponding mRNA in the two cell popula- 
tions (Fig. 1). Data from the series of seven 
samples (Fig. 2), consisting of more than 
43,000 expression-ratio measurements, 
were organized into a database to facilitate 
efficient exploration and analysis of the 
results. This database is publicly available 
on the Internet (13). 
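
As an illustration only, the two-color ratio measurement described above can be sketched in Python. The function name, the background-subtraction step, and the sample intensities are assumptions made for this sketch and are not details taken from the experiment.

```python
import math

def log2_expression_ratio(cy5, cy3, background=0.0):
    """Return log2(Cy5/Cy3) for one array element, or None if undefined.

    cy5: fluorescence of the time-point sample (red channel)
    cy3: fluorescence of the reference sample (green channel)
    background: hypothetical local background subtracted from both channels
    """
    red = cy5 - background
    green = cy3 - background
    if red <= 0 or green <= 0:
        return None  # signal not above background, so no ratio is reported
    return math.log2(red / green)

# Hypothetical (Cy5, Cy3) intensities for three array elements
spots = {
    "geneA": (4000.0, 1000.0),  # more abundant at the later time point
    "geneB": (500.0, 2000.0),   # less abundant at the later time point
    "geneC": (1500.0, 1500.0),  # unchanged
}
ratios = {name: log2_expression_ratio(r, g) for name, (r, g) in spots.items()}
```

Working on a log scale is a common convention, assumed here, that makes a fourfold induction and a fourfold repression symmetric about zero.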

During exponential growth in glucose- 
rich medium, the global pattern of gene 
expression was remarkably stable. Indeed, 
when gene expression patterns between the 
first two cell samples (harvested at a 2-hour 
interval) were compared, mRNA levels dif- 
fered by a factor of 2 or more for only 19 
genes (0.3%), and the largest of these dif- 
ferences was only 2.7-fold (14). However, as 
glucose was progressively depleted from the 
growth media during the course of the ex- 
periment, a marked change was seen in the 
global pattern of gene expression. mRNA 
levels for approximately 710 genes were 
induced by a factor of at least 2, and the 
mRNA levels for approximately 1030 genes 
declined by a factor of at least 2. Messenger 
RNA levels for 183 genes increased by a 
factor of at least 4, and mRNA levels for 
203 genes diminished by a factor of at least 
4. About half of these differentially ex- 
pressed genes have no currently recognized 
function and are not yet named. Indeed,
more than 400 of the differentially ex-
pressed genes have no apparent homology
to any gene whose function is known (15).
The responses of these previously unchar-
acterized genes to the diauxic shift therefore
provide the first small clue to their possible
roles.
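
The counts of induced and repressed genes quoted above follow from applying a fold-change threshold to each expression ratio. A minimal sketch of that tally is given below; the function name and the five example ratios are hypothetical and serve only to show the arithmetic.

```python
def count_changed(ratios, fold=2.0):
    """Count genes induced or repressed by at least `fold`.

    ratios: mapping of gene name to expression ratio (sample / reference)
    Returns a tuple (number induced, number repressed).
    """
    induced = sum(1 for r in ratios.values() if r >= fold)
    repressed = sum(1 for r in ratios.values() if r <= 1.0 / fold)
    return induced, repressed

# Hypothetical expression ratios for five genes
example_ratios = {"g1": 2.7, "g2": 0.4, "g3": 1.1, "g4": 4.5, "g5": 0.2}

twofold = count_changed(example_ratios, fold=2.0)
fourfold = count_changed(example_ratios, fold=4.0)
```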

The global view of changes in expres- 
sion of genes with known functions pro- 
vides a vivid picture of the way in which 
the cell adapts to a changing environ- 
ment. Figure 3 shows a portion of the yeast 
metabolic pathways involved in carbon 
and energy metabolism. Mapping the 
changes we observed in the mRNAs en- 
coding each enzyme onto this framework 
allowed us to infer the redirection in the 
flow of metabolites through this system. 
We observed large inductions of the genes
coding for the enzymes aldehyde dehydro-
genase (ALD2) and acetyl-coenzyme
A (CoA) synthase (ACS1), which func-
tion together to convert the products of
alcohol dehydrogenase into acetyl-CoA,
which in turn is used to fuel the tricarbox-
ylic acid (TCA) cycle and the glyoxylate
cycle. The concomitant shutdown of tran-
scription of the genes encoding pyruvate
decarboxylase and induction of pyruvate
carboxylase rechannels pyruvate away
from acetaldehyde, and instead to oxalac-
etate, where it can serve to supply the
TCA cycle and gluconeogenesis. Induc-
tion of the pivotal genes PCK1, encoding
phosphoenolpyruvate carboxykinase, and
FBP1, encoding fructose 1,6-biphos-
phatase, switches the directions of two key
irreversible steps in glycolysis, reversing
the flow of metabolites along the revers-
ible steps of the glycolytic pathway toward
the essential biosynthetic precursor, glu-
cose-6-phosphate. Induction of the genes
coding for the trehalose synthase and gly- 
cogen synthase complexes promotes chan- 
neling of glucose-6-phosphate into these 
carbohydrate storage pathways. 

Just as the changes in expression of 
genes encoding pivotal enzymes can pro- 
vide insight into metabolic reprogram- 
ming, the behavior of large groups of func- 
tionally related genes can provide a broad 
view of the systematic way in which the 
yeast cell adapts to a changing environ- 
ment (Fig. 4). Several classes of genes, 
such as cytochrome c-related genes and 
those involved in the TCA/glyoxylate cy-
cle and carbohydrate storage, were coordi-
nately induced by glucose exhaustion. In
contrast, genes devoted to protein synthe- 
sis, including ribosomal proteins, tRNA 
synthetases, and translation, elongation, 
and initiation factors, exhibited a coordi- 
nated decrease in expression. More than 
95% of ribosomal genes showed at least
twofold decreases in expression during the
diauxic shift (Fig. 4) (13). A noteworthy
and illuminating exception was that the
genes encoding mitochondrial ribosomal
proteins were generally induced rather than
repressed after glucose limitation, high-
lighting the requirement for mitochondrial
biogenesis (13). As more is learned about
the functions of every gene in the yeast 
genome, the ability to gain insight into a 
cell's response to a changing environment 
through its global gene expression patterns 
will become increasingly powerful. 

Several distinct temporal patterns of ex- 
pression could be recognized, and sets of 
genes could be grouped on the basis of the 
similarities in their expression patterns. The 
characterized members of each of these 
groups also shared important similarities in 
their functions. Moreover, in most cases, 
common regulatory mechanisms could be 
inferred for sets of genes with similar expres- 
sion profiles. For example, seven genes 
showed a late induction profile, with mRNA
levels increasing by more than ninefold at
the last timepoint but less than threefold at
the preceding timepoint (Fig. 5B). All of
these genes were known to be glucose-re-
pressed, and five of the seven were previously
noted to share a common upstream activat-
ing sequence (UAS), the carbon source re-
sponse element (CSRE) (16-20). A search
in the promoter regions of the remaining two
genes, ACR1 and IDP2, revealed that
ACR1, a gene essential for ACS1 activity,
also possessed a consensus CSRE motif, but
interestingly, IDP2 did not. A search of the
entire yeast genome sequence for the con- 
sensus CSRE motif revealed only four addi- 
tional candidate genes, none of which 
showed a similar induction. 
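
The late induction profile described above (more than ninefold at the last timepoint, less than threefold at the preceding one) amounts to a simple filter over each gene's time series. The sketch below illustrates that filter under stated assumptions: the function name, the gene labels, and the fold-change values are all hypothetical.

```python
def late_induced(profiles, last_min=9.0, prev_max=3.0):
    """Return the sorted names of genes with a late induction profile.

    profiles: mapping of gene name to a list of fold changes, one per
              timepoint, each relative to the reference sample
    A gene qualifies when its final value exceeds `last_min` and the
    value at the preceding timepoint is below `prev_max`.
    """
    return sorted(
        gene
        for gene, series in profiles.items()
        if series[-1] > last_min and series[-2] < prev_max
    )

# Hypothetical seven-timepoint profiles
example_profiles = {
    "geneA": [1.0, 1.1, 1.0, 1.2, 1.5, 2.5, 12.0],  # late induction
    "geneB": [1.0, 1.3, 2.0, 3.5, 5.0, 8.0, 10.0],  # gradual induction
    "geneC": [1.0, 0.9, 1.0, 1.0, 1.1, 1.2, 1.3],   # essentially unchanged
}
selected = late_induced(example_profiles)
```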

Examples from additional groups of 
genes that shared expression profiles are 
illustrated in Fig. 5, C through F. The 
sequences upstream of the named genes in 
Fig. 5C all contain stress response ele-
ments (STRE), and with the exception
of HSP42, have previously been shown to
be controlled at least in part by these
elements (21-24). Inspection of the se-
quences upstream of HSP42 and the two
uncharacterized genes shown in Fig. 5C,
YKL026c, a hypothetical protein with
similarity to glutathione peroxidase, and
YGR043c, a putative transaldolase, re-
vealed that each of these genes also pos-
sesses repeated upstream copies of the stress-
responsive CCCCT motif. Of the 13 ad-
ditional genes in the yeast genome that
shared this expression profile [including
HSP30, ALD2, OM45, and 10 uncharac-
terized ORFs (25)], nine contained one or
more recognizable STRE sites in their up-
stream regions.

Fig. 1. Yeast genome microarray. The actual size of the microarray is 18 mm by 18 mm.

The heterotrimeric transcriptional acti-
vator complex HAP2,3,4 has been shown
to be responsible for induction of several 
genes important for respiration (26-28). 
This complex binds a degenerate consensus 
sequence known as the CCAAT box (26). 
Computer analysis, using the consensus se- 
quence TNRYTGGB (29), has suggested 
that a large number of genes involved in 
respiration may be specific targets of 
HAP2,3,4 (30). Indeed, a putative 
HAP2,3,4 binding site could be found in 
the sequences upstream of each of the seven 
cytochrome c-related genes that showed 
the greatest magnitude of induction (Fig. 
5D). Of 12 additional cytochrome c-related 
genes that were induced, HAP2,3,4 binding
sites were present in all but one. Signifi- 
cantly, we found that transcription of 
HAP4 itself was induced nearly ninefold 
concomitant with the diauxic shift. 
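
A search for a degenerate consensus such as TNRYTGGB can be sketched by expanding each IUPAC ambiguity code into a character class and scanning both strands of an upstream sequence. The code below is a minimal illustration, not the computer analysis cited above; the function names and the test sequences are hypothetical.

```python
import re

# Standard IUPAC nucleotide codes needed for the consensus TNRYTGGB
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "B": "[CGT]", "N": "[ACGT]",
}

def consensus_to_regex(consensus):
    """Compile a consensus into a lookahead regex so overlapping sites are found."""
    body = "".join(IUPAC[base] for base in consensus)
    return re.compile("(?=" + body + ")")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_sites(upstream, consensus="TNRYTGGB"):
    """Return sorted (start, strand) pairs; start is on the given strand."""
    pattern = consensus_to_regex(consensus)
    width = len(consensus)
    hits = [(m.start(), "+") for m in pattern.finditer(upstream)]
    rc = reverse_complement(upstream)
    # Convert reverse-strand coordinates back to the forward strand
    hits += [(len(upstream) - (m.start() + width), "-") for m in pattern.finditer(rc)]
    return sorted(hits)

# Hypothetical upstream sequences
forward_hit = find_sites("AAATGACTGGCAAA")
reverse_hit = find_sites("GGGCCAGTCAGG")
```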

Control of ribosomal protein biogenesis 
is mainly exerted at the transcriptional 
level, through the presence of a common 
upstream-activating element (UASrpg)
that is recognized by the Rap1 DNA-bind-
ing protein (31, 32). The expression pro-
files of seven ribosomal proteins are shown
in Fig. 5F. A search of the sequences
upstream of all seven genes revealed con-
sensus Rap1-binding motifs (33). It has
been suggested that declining Rap1 levels
in the cell during starvation may be re-
sponsible for the decline in ribosomal pro-
tein gene expression (34). Indeed, we ob-
served that the abundance of RAP1
mRNA diminished by 4.4-fold, at about 
the time of glucose exhaustion. 

Of the 149 genes that encode known or 
putative transcription factors, only two, 
HAP4 and SIP4, were induced by a factor of
more than threefold at the diauxic shift.
SIP4 encodes a DNA-binding transcrip-
tional activator that has been shown to
interact with Snf1, the "master regulator" of
glucose repression (35). The eightfold in- 
duction of SIP4 upon depletion of glucose 
strongly suggests a role in the induction of 



downstream genes at the diauxic shift. 

Although most of the transcriptional 
responses that we observed were not pre- 
viously known, the responses of many 
genes during the diauxic shift have been 
described. Comparison of the results we 
obtained by DNA microarray hybridiza- 
tion with previously reported results there- 
fore provided a strong test of the sensitiv- 
ity and accuracy of this approach. The 
expression patterns we observed for previ- 
ously characterized genes showed almost 
perfect concordance with previously pub- 
lished results (36). Moreover, the differ- 
ential expression measurements obtained 
by DNA microarray hybridization were re- 
producible in duplicate experiments. For 
example, the remarkable changes in gene 
expression between cells harvested imme- 
diately after inoculation and immediately 
after the diauxic shift (the first and sixth 
intervals in this time series) were mea- 
sured in duplicate, independent DNA mi- 
croarray hybridizations. The correlation 
coefficient for two complete sets of expres- 
sion ratio measurements was 0.87, and for 
more than 95% of the genes, the expres- 



sion ratios measured in these duplicate 
experiments differed by less than a factor 
of 2. However, in a few cases, there were 
discrepancies between our results and pre- 
vious results, pointing to technical limita- 
tions that will need to be addressed as 
DNA microarray technology advances
(37, 38). Despite the noted exceptions,
the high concordance between the results 
we obtained in these experiments and 
those of previous studies provides confi- 
dence in the reliability and thoroughness 
of the survey. 
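
The two reproducibility measures reported above, a correlation coefficient between duplicate sets of expression ratios and the share of genes whose duplicate ratios differ by less than a factor of 2, can be sketched as follows. The text does not state whether the correlation was computed on raw or log-transformed ratios, so the function simply takes whatever values it is given; the function names and the replicate values are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def fraction_within_fold(ratios_a, ratios_b, fold=2.0):
    """Fraction of genes whose duplicate ratios differ by less than `fold`."""
    agree = sum(
        1 for a, b in zip(ratios_a, ratios_b) if max(a / b, b / a) < fold
    )
    return agree / len(ratios_a)

# Hypothetical expression ratios from two duplicate hybridizations
replicate_1 = [1.0, 2.0, 4.0, 0.5]
replicate_2 = [1.1, 1.8, 9.0, 0.45]

r = pearson(replicate_1, replicate_2)
agreement = fraction_within_fold(replicate_1, replicate_2)
```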

The changes in gene expression during 
this diauxic shift are complex and involve 
integration of many kinds of information 
about the nutritional and metabolic state 
of the cell. The large number of genes 
whose expression is altered and the diver- 
sity of temporal expression profiles ob- 
served in this experiment highlight the 
challenge of understanding the underlying 
regulatory mechanisms. One approach to 
defining the contributions of individual 
regulatory genes to a complex program of 
this kind is to use DNA microarrays to 
identify genes whose expression is affected 



Fig. 2. The section of the ar- 
ray indicated by the gray box 
in Rg. 1 is shown tor each of 
the experiments described 
here. Representative genes 
are labeled. In each of the ar- 
rays used to analyze gene 
expression during the diauxic 
shift, red spots represent 
genes that were induced rel- 
ative to the initial timepoint, 
and green spots represent 
genes that were repressed 
relative to the initial timepoint. 
In the arrays used to analyze 
the effects of the tuplb mu- 
tation and YAPl overexpres- 
sion, red spots represent 
genes whose expression was 
increased, and green spots 
represent genes whose ex- 
pression was decreased by 
the genetic modification. Note 
that distinct sets of genes are 
induced and repressed in the 
different experiments. The 
complete images of each of 
these arrays can be viewed on 
the Internet (13). Cell density 
as measured by optical densi- 
ty (OD) at 600 nm was used to 
measure the growth of the 
culture.

by mutations in each putative regulatory
gene. As a test of this strategy, we analyzed
the genomewide changes in gene expression
that result from deletion of the TUP1 gene.
Transcriptional repression of many genes by 
glucose requires the DNA-binding repressor 



Mig1 and is mediated by recruiting the tran-
scriptional co-repressors Tup1 and Cyc8/
Ssn6 (39). Tup1 has also been implicated in
repression of oxygen-regulated, mating-type- 
specific, and DNA-damage-inducible genes 
(40).

Wild-type yeast cells and cells bearing
a deletion of the TUP1 gene (tup1Δ) were
grown in parallel cultures in rich medium
containing glucose as the carbon source.
Messenger RNA was isolated from expo-
nentially growing cells from the two pop-
ulations and used to prepare cDNA la-
beled with Cy3 (green) and Cy5 (red),
respectively (11). The labeled probes were
mixed and simultaneously hybridized to
the microarray. Red spots on the microar-
ray therefore represented genes whose
transcription was induced in the tup1Δ
strain, and thus presumably repressed by
Tup1 (41). A representative section of the
microarray (Fig. 2, bottom middle panel)
illustrates that the genes whose expression
was affected by the tup1Δ mutation were,
in general, distinct from those induced 
upon glucose exhaustion [complete images
of all the arrays shown in Fig. 2 are avail-
able on the Internet (13)]. Nevertheless,
34 (10%) of the genes that were induced
by a factor of at least 2 after the diauxic
shift were similarly induced by deletion of 
TUP1, suggesting that these genes may be
subject to TUP1-mediated repression by
glucose. For example, SUC2, the gene en-
coding invertase, and all five hexose trans-
porter genes that were induced during the
course of the diauxic shift were similarly
induced, in duplicate experiments, by the
deletion of TUP1.

The set of genes affected by Tup1 in this
experiment also included α-glucosidases,
the mating-type-specific genes MFA1 and
MFA2, and the DNA damage-inducible
RNR2 and RNR4, as well as genes involved
in flocculation and many genes of unknown
function. The hybridization signal corre-
sponding to expression of TUP1 itself was
also severely reduced because of the (in-
complete) deletion of the transcription unit
in the tup1Δ strain, providing a positive
control in the experiment (42).

Many of the transcriptional targets of
Tup1 fell into sets of genes with related
biochemical functions. For instance, al-
though only about 3% of all yeast genes
appeared to be TUP1-repressed by a factor
of more than 2 in duplicate experiments
under these conditions, 6 of the 13 genes
that have been implicated in flocculation
(15) showed a reproducible increase in
expression of at least twofold when TUP1
was deleted. Another group of related
genes that appeared to be subject to TUP1
repression encodes the serine-rich cell
wall mannoproteins, such as Tip1 and
Tir1/Srp1, which are induced by cold
shock and other stresses (43), and similar,
serine-poor proteins, the seripauperins
(44). Messenger RNA levels for 23 of the
26 genes in this group were reproducibly
elevated by at least 2.5-fold in the tup1Δ
strain, and 18 of these genes were induced
by more than sevenfold when TUP1 was
deleted. In contrast, none of 83 genes that
could be classified as putative regulators of
the cell division cycle were induced more
than twofold by deletion of TUP1. Thus,
despite the diversity of the regulatory sys-
tems that employ Tup1, most of the genes
that it regulates under these conditions
fall into a limited number of distinct func-
tional classes.
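
To give a feel for how unlikely the flocculation result would be by chance, the sketch below computes a binomial tail probability: the chance that at least 6 of 13 genes would appear TUP1-repressed if each gene independently had the genome-wide rate of about 3%. This back-of-the-envelope check is not reported in the paper; the independence assumption and the function name are illustrative only.

```python
from math import comb

def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 13 flocculation genes, 6 observed as TUP1-repressed, 3% background rate.
p_value = binomial_tail(13, 6, 0.03)
print(f"{p_value:.2e}")
```

Under these assumptions the probability is on the order of one in a million, consistent with the conclusion that Tup1 targets cluster in functional classes.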

Because the microarray allows us to 
monitor expression of nearly every gene in 
yeast, we can, in principle, use this ap- 
proach to identify all the transcriptional 
targets of a regulatory protein like Tup1. It
is important to note, however, that in any
single experiment of this kind we can only
recognize those target genes that are nor-
mally repressed (or induced) under the
conditions of the experiment. For in-
stance, the experiment described here an-
alyzed a MATα strain in which MFA1
and MFA2, the genes encoding the a-
factor mating pheromone precursor, are
normally repressed. In the isogenic tup1Δ
strain, these genes were inappropriately
expressed, reflecting the role that Tup1
plays in their repression. Had we instead
carried out this experiment with a MATa
strain (in which expression of MFA1 and
MFA2 is not repressed), it would not have
been possible to conclude anything re-
garding the role of Tup1 in the repression
of these genes. Conversely, we cannot dis-
tinguish indirect effects of the chronic
absence of Tup1 in the mutant strain from
effects directly attributable to its partici- 
pation in repressing the transcription of a 
gene. 

Another simple route to modulating the 
activity of a regulatory factor is to overex-
press the gene that encodes it. YAP1 en-
codes a DNA-binding transcription factor
belonging to the b-zip class of DNA-bind-
ing proteins. Overexpression of YAP1 in
yeast confers increased resistance to hydro-
gen peroxide, o-phenanthroline, heavy
metals, and osmotic stress (45). We ana-
lyzed differential gene expression between a
wild-type strain bearing a control plasmid
and a strain with a plasmid expressing YAP1
under the control of the strong GAL1-10
promoter, both grown in galactose (that is,
a condition that induces YAP1 overexpres-
sion). Complementary DNA from the con-
trol and YAP1-overexpressing strains, la-
beled with Cy3 and Cy5, respectively, was
prepared from mRNA isolated from the two
strains and hybridized to the microarray.
Thus, red spots on the array represent genes
that were induced in the strain overexpress-
ing YAP1.

Of the 17 genes whose mRNA levels
increased by more than threefold when
YAP1 was overexpressed in this way, five
bear homology to aryl-alcohol oxidoreduc-
tases (Fig. 2 and Table 1). An additional
four of the genes in this set also belong to
the general class of dehydrogenases/oxi-
doreductases. Very little is known about
the role of aryl-alcohol oxidoreductases in
S. cerevisiae, but these enzymes have been
isolated from ligninolytic fungi, in which
they participate in coupled redox reac-
tions, oxidizing aromatic and aliphatic
unsaturated alcohols to aldehydes with the
production of hydrogen peroxide (46, 47).
The fact that a remarkable fraction of the
targets identified in this experiment be-
long to the same small, functional group of
oxidoreductases suggests that these genes
might play an important protective role
during oxidative stress. Transcription of a
small number of genes was reduced in the
strain overexpressing Yap1. Interestingly,
many of these genes encode sugar per-
meases or enzymes involved in inositol
metabolism.

Fig. 4. Coordinated regulation of functionally related genes. The curves
represent the average induction or repression ratios for all the genes in
each indicated group. The total number of genes in each group was as
follows: ribosomal proteins, 112; translation elongation and initiation
[...] oxidase and reductase proteins, 19; and TCA- and glyoxylate-cycle
enzymes, 24.

Table 1 (caption fragments): [...] for which the average increase in mRNA
level in two duplicate experiments [...] was greater than threefold [...]
Positions of the canonical Yap1 binding sites [...] average fold-increase
in mRNA level [...]

We searched for Yap1-binding sites
(TTACTAA or TGACTAA) in the se-
quences upstream of the target genes we
identified (48). About two-thirds of the
genes that were induced by more than
threefold upon Yap1 overexpression had
one or more binding sites within 600 bases
upstream of the start codon (Table 1), sug-
gesting that they are directly regulated by
Yap1. The absence of canonical Yap1-bind-
ing sites upstream of the others may reflect
an ability of Yap1 to bind sites that differ
from the canonical binding sites, perhaps in
cooperation with other factors, or, less like-
ly, may represent an indirect effect of Yap1
overexpression, mediated by one or more
intermediary factors. Yap1 sites were found
only four times in the corresponding region
of an arbitrary set of 30 genes that were not
differentially regulated by Yap1.
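
The binding-site search described above (canonical sites TTACTAA or TGACTAA within 600 bases upstream of the start codon) can be sketched as a simple sequence scan. This is an illustrative sketch rather than the procedure actually used: the function names are hypothetical, and scanning both strands is an assumption, since the text does not say which strand was searched.

```python
YAP1_SITES = ("TTACTAA", "TGACTAA")  # canonical sites quoted in the text

def reverse_complement(seq):
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))

def find_yap1_sites(upstream, window=600, both_strands=True):
    """Return (position, strand, site) hits in the last `window` bases.

    `upstream` is the sequence 5'->3' ending immediately before the start
    codon; positions are negative offsets from the start codon.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    region = upstream.upper()[-window:]
    motifs = {site: "+" for site in YAP1_SITES}
    if both_strands:  # assumption: a site on either strand counts
        for site in YAP1_SITES:
            motifs.setdefault(reverse_complement(site), "-")
    hits = []
    for i in range(len(region) - 7 + 1):
        kmer = region[i:i + 7]
        if kmer in motifs:
            hits.append((i - len(region), motifs[kmer], kmer))
    return hits

# Hypothetical 14-base upstream fragment with one forward-strand site.
print(find_yap1_sites("GGGTTACTAACCCC"))
```

Running the same scan over a set of genes that do not respond to Yap1 gives the kind of background rate the text reports (four sites in 30 control genes).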

Use of a DNA microarray to character- 
ize the transcriptional consequences of 
mutations affecting the activity of regula- 
tory molecules provides a simple and pow- 
erful approach to dissection and character- 
ization of regulatory pathways and net- 



works. This strategy also has an important 
practical application in drug screening. 
Mutations in specific genes encoding can- 
didate drug targets can serve as surrogates 
for the ideal chemical inhibitor or modu- 
lator of their activity. DNA microarrays 
can be used to define the resulting signa- 
ture pattern of alterations in gene expres- 
sion, and then subsequently used in an 
assay to screen for compounds that repro- 
duce the desired signature pattern. 

DNA microarrays provide a simple and 
economical way to explore gene expres- 
sion patterns on a genomic scale. The 
hurdles to extending this approach to any 
other organism are minor. The equipment
required for fabricating and using DNA
microarrays (9) consists of components
that were chosen for their modest cost and
simplicity. It was feasible for a small group
to accomplish the amplification of more
than 6000 genes in about 4 months and,
once the amplified gene sequences were in
hand, only 2 days were required to print a
set of 110 microarrays of 6400 elements
each. Probe preparation, hybridization,
and fluorescent imaging are also simple
procedures. Even conceptually simple ex-
periments, as we described here, can yield
vast amounts of information. The value of
the information from each experiment of
this kind will progressively increase as
more is learned about the functions of
each gene and as additional experiments
define the global changes in gene expres-
sion in diverse other natural processes and
genetic perturbations. Perhaps the greatest
challenge now is to develop efficient
methods for organizing, distributing, inter-
preting, and extracting insights from the
large volumes of data these experiments
will provide.

We describe here a method for drug target validation and identification of secondary drug tar-
get effects based on genome-wide gene expression patterns. The method is demonstrated by
several experiments, including treatment of yeast mutant strains defective in calcineurin, im-
munophilins or other genes with the immunosuppressants cyclosporin A or FK506. Presence or
absence of the characteristic drug 'signature' pattern of altered gene expression in drug-treated
cells with a mutation in the gene encoding a putative target established whether that target was
required to generate the drug signature. Drug dependent effects were seen in 'targetless' cells,
showing that FK506 affects additional pathways independent of calcineurin and the im-
munophilins. The described method permits the direct confirmation of drug targets and recog-
nition of drug-dependent changes in gene expression that are modulated through pathways
distinct from the drug's intended target. Such a method may prove useful in improving the effi-
ciency of drug development programs.



Good drugs are potent and specific; that is, they must have
strong effects on a specific biological pathway and minimal ef- 
fects on all other pathways. Confirmation that a compound in- 
hibits the intended target (drug target validation) and the 
identification of undesirable secondary effects are among the 
main challenges in developing new drugs. Comprehensive 
methods that enable researchers to determine which genes or 
activities are affected by a given drug might improve the effi- 
ciency of the drug discovery process by quickly identifying po- 
tential protein targets, or by accelerating the identification of 
compounds likely to be toxic. DNA microarray technology, 
which permits simultaneous measurement of the expression 
levels of thousands of genes, provides a comprehensive frame- 
work to determine how a compound affects cellular metabolism 
and regulation on a genomic scale 1-11. DNA microarrays that
contain essentially every open reading frame (ORF) in the 
Saccharomyces cerevisiae genome have already been used success- 
fully to explore the changes in gene expression that accompany 
large changes in cellular metabolism or cell cycle progression 7,10.

In the modern drug discovery paradigm, which typically be- 
gins with the selection of a single molecular target, the ideal in- 
hibitory drug is one that inhibits a single gene product so 
completely and so specifically that it is as if the gene product 
were absent. Treating cells with such a drug should induce 
changes in gene expression very similar to those resulting from 
deleting the gene encoding the drug's target. Here we have com-
pared the genome-wide effects on gene expression that result
from deletions of various genes in the budding yeast S. cerevisiae
to the effects on gene expression that result from treatment 



with known inhibitors of those gene products. Using the cal- 
cineurin signaling pathway as a model system, we tested an ap- 
proach that permits identification of genes that encode proteins 
specifically involved in pathways affected by a drug. The FK506 
characteristic pattern, or 'signature', of altered gene expression
was not observed in mutant cells lacking proteins inhibited by 
FK506 (for example, a calcineurin or FK506-binding-protein 
mutant strain), but was observed in mutants deleted for genes 
in pathways unrelated to FK506 action (for example, a cy- 
clophilin mutant strain). Conversely, the cyclosporin A (CsA) 
signature was not observed in CsA-treated calcineurin or cy-
clophilin mutant strains, but was seen in an FK506-binding-pro-
tein mutant strain treated with CsA. The method also
demonstrates that FK506, a clinically used immunosuppressant, 
has 'off-target' effects that are independent of its binding to im-
munophilins. Thus, the approach we describe may provide a
way to identify the pathways altered by a drug and to detect 
drug effects mediated through unintended targets. 

Null mutants phenocopy drug-treated cells on a genomic scale 
To test whether a null mutation in a drug target serves as a 
model of an ideal inhibitory drug, we examined the effects on 
gene expression associated with pharmacological or genetic in- 
hibition of calcineurin function. Calcineurin is a highly con- 
served calcium- and calmodulin-activated serine/threonine 
protein phosphatase implicated in diverse processes dependent 
on calcium signaling 12,13. In budding yeast, calcineurin is re-
quired for intracellular ion homeostasis 14, for adaptation to pro-
longed mating pheromone treatment 15 and in the regulation of

Fig. 1 Model of antagonism of the calcineurin signaling pathway mediated
by FK506 and cyclosporin A (CsA). Calcineurin activity is composed of a cat-
alytic subunit (calcineurin A, encoded in yeast by the CNA1 and CNA2 genes),
and calcium-binding regulatory subunits calmodulin (CMD) and calcineurin B
(CnB). After entering cells, FK506 and CsA specifically bind and inhibit the
peptidyl-proline isomerase activity of their respective immunophilins,
FK506 binding proteins (FKBP) and cyclophilins (CyP). The most abundant im-
munophilins in yeast (Fpr1 and Cph1) are thought to mediate calcineurin in-
hibition. Drug-immunophilin complexes bind and inhibit the calcium- and
calmodulin-stimulated phosphatase calcineurin. Among the substrates of cal-
cineurin are transcriptional activators that act to modulate gene expression.



the onset of mitosis 16. In mammals, calcineurin has been impli-
cated in T-cell activation 12, in apoptosis 17, in cardiac hypertro-
phy 18 and in the transition from short-term to long-term
memory 19. In both organisms, calcineurin activity is inhibited
by FK506 and CsA, immunosuppressant drugs whose effects on
calcineurin are mediated through families of intracellular recep-
tor proteins called immunophilins 12,20 (Fig. 1). To assess the ef-
fects of pharmacologic inhibition of calcineurin, wild-type S.
cerevisiae was grown to early logarithmic phase in the presence
or absence of FK506 or CsA. Isogenic cells, from which the
genes encoding the catalytic subunits of calcineurin (CNA1 and
CNA2) had been deleted 21 (referred to as the cna or calcineurin
mutant), were grown in parallel, in the absence of the drug.
Fluorescently-labeled cDNA was prepared by reverse transcrip-
tion of polyA+ RNA in the presence of Cy3- or Cy5-deoxynu-
cleotide triphosphates and then hybridized to a microarray
containing more than 6,000 DNA probes representing 97% of
the known or predicted ORFs in the yeast genome.
Simultaneous hybridization of Cy5-labeled cDNA from mock- 
treated cells and Cy3-labeled cDNA from cells treated with 1 
μg/ml FK506 allowed the effect of drug treatment on mRNA lev-
els of each ORF to be determined (Fig. 2a and b and data not
shown). Similarly, effects of the calcineurin mutations on the 
mRNA levels of each gene were assessed by simultaneous hy-
bridization of Cy5-labeled cDNA from wild-type cells and Cy3-
labeled cDNA from the calcineurin mutant strain (Fig. 2c). For
each comparison of this kind, reported expression ratios are the 
average of at least two hybridizations in which the Cy3 and Cy5 
fluors were reversed to remove biases that may be introduced by 
gene-specific differences in incorporation of the two fluors 
(data not shown). 
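
The fluor-reversal averaging described above can be sketched as follows. This is a minimal illustration, assuming the two hybridizations are combined by averaging log10 ratios (the text says only that ratios were averaged); the function name and the intensity values are hypothetical.

```python
import math

def dye_swap_log_ratio(forward, reverse):
    """Average log10(treated/control) over a dye-swap pair.

    forward: (cy5, cy3) intensities with control in Cy5 and treated in Cy3.
    reverse: (cy5, cy3) intensities with the fluors reversed.
    """
    f_cy5, f_cy3 = forward
    r_cy5, r_cy3 = reverse
    log_fwd = math.log10(f_cy3 / f_cy5)  # treated is Cy3 here
    log_rev = math.log10(r_cy5 / r_cy3)  # treated is Cy5 here
    return (log_fwd + log_rev) / 2

# Hypothetical gene: true twofold induction, Cy5 signal inflated 1.5x.
forward = (100 * 1.5, 200)   # control in Cy5, treated in Cy3
reverse = (200 * 1.5, 100)   # treated in Cy5, control in Cy3
print(10 ** dye_swap_log_ratio(forward, reverse))
```

Because a gene-specific bias multiplies the ratio in one hybridization and divides it in the other, it cancels in the log-scale average, which is the rationale the text gives for reversing the fluors.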

Treatment with FK506 in these growth conditions resulted in 
a signature pattern of altered gene expression in which mRNA 
levels of 36 ORFs changed by more than twofold 
(http://www.rosetta.org). A very similar pattern of altered gene 
expression was observed when the calcineurin mutant strain 
was compared to wild-type cells. Comparison of the changes in 
mRNA expression of each gene resulting from treatment of 
wild-type cells with FK506 with mRNA expression changes re- 
sulting from deletion of the calcineurin genes showed the con- 
siderable similarity of the global transcript alterations in 
response to the two perturbations (Fig. 2b-d). Quantification of 
this similarity using the correlation coefficient (ρ) showed
large correlations between the FK506 treatment signature and
the calcineurin deletion signature (ρ = 0.75 ± 0.03), as well as
the CsA treatment signature (ρ = 0.94 ± 0.02), but not with a
randomly selected deletion mutant strain (deleted for the
YER071C gene; ρ = -0.07 ± 0.04; Fig. 2e). The FK506 treatment
signature was also compared with those of more than 40 other
deletion mutant strains or drug-treatments thought to affect
unrelated pathways, and none had statistically significant cor-
relations. These data establish that genetic disruption of cal-
cineurin function provides a close and specific phenocopy of
treatment with FK506 or CsA.
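
The signature comparison above relies on a correlation coefficient between two sets of expression ratios. The sketch below computes a plain Pearson correlation over log10 ratios for the same ORFs in two experiments. It is illustrative only: the text does not specify the exact estimator or how the reported error bounds were derived, and the signature values are invented toy data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Toy log10 expression ratios (hypothetical) for the same four ORFs.
drug_signature = [0.5, -0.3, 0.0, 1.2]
mutant_signature = [0.4, -0.2, 0.1, 1.0]
print(round(pearson(drug_signature, mutant_signature), 3))
```

A value near 1 corresponds to the drug-versus-mutant phenocopy described in the text; a value near 0 corresponds to the unrelated deletion control.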

To avoid generalizing from a single example, we also com-
pared the effects of treatment of wild-type cells with 3-aminotri-
azole (3-AT) with the effects of deletion of the HIS3 gene. HIS3
encodes imidazoleglycerol phosphate dehydratase, which cat-
alyzes the seventh step of the histidine biosynthetic pathway in
yeast 22; 3-AT is a competitive inhibitor of this enzyme that trig-
gers a large transcriptional amino-acid starvation response 23.
Microarray analysis of wild-type and isogenic his3-deficient
strains demonstrated the expected large genome-wide transcrip-
tional responses (involving more than 1,000 ORFs) resulting
from treatment with 3-AT (Fig. 3a) or from HIS3 deletion (Fig.
3c). Quantitative comparison of the 3-AT treatment signature
and the his3 mutant signature showed a high level of correlation
(ρ = 0.76 ± 0.02) that even extended to genes that experienced
small changes in expression level (Fig. 3b). As a negative control,
the correlations between the 3-AT treatment signature or the
his3 mutant signature and the calcineurin mutant strain were
not statistically significant (ρ = 0.09 ± 0.06 and -0.01 ± 0.04, re-
spectively). That both the calcineurin/FK506 and the his3/3-AT
comparisons were highly correlated indicates that in many cases 
the expression profile resulting from a gene deletion closely re- 
sembles the expression profile of wild-type cells treated with an 
inhibitor of that gene's product.

'Decoder' strategy: Drug target validation with deletion mutants
Because pharmacological inhibition of different targets might 
give similar or identical expression profiles, simple comparison 
of drug signatures to mutant signatures is unlikely to unambigu-
ously identify a drug's target. To overcome this limitation, an
additional 'decoder' step is used. We first compare the expres-
sion profile of wild-type drug-treated cells to the expression pro-
files from a panel of genetic mutant strains, using a correlation 
coefficient metric. Mutant strains whose expression profile is 
similar to that of drug-treated wild-type cells are selected and 
subjected to drug treatment, generating the drug signature in 
the mutant strain (that is, the mutant drug signature). If the
mutated gene encodes a protein involved in a pathway affected 
by the drug, we expect the drug signature in mutant cells to be 
different (or absent, for an ideal drug) from the drug signature 
seen in wild-type cells.

Fig. 2 Expression profiles from FK506-treated wild-type (wt) cells and a
calcineurin-disruption mutant strain share a genome-wide correlation. DNA
microarray analysis showing changes in gene expression resulting from FK506
treatment (a and b) or from genetic disruption of genes encoding calcineurin
(c). a, Pseudo-color image of the results of simultaneous hybridization of
Cy5-labeled cDNA (red) from mock-treated strain R563 and Cy3-labeled cDNA
(green) from strain R563 treated with 1 μg/ml FK506. b, Enlarged view of the
boxed area in a. Arrowheads indicate specific ORFs induced or repressed.
c, Pseudo-color image of the results of simultaneous hybridization of
Cy5-labeled cDNA (red) from strain R563 and Cy3-labeled cDNA (green) from
strain MCY300 (deleted for the CNA1 and CNA2 catalytic subunits of
calcineurin). Arrows indicate specific ORFs induced or repressed. d, The
log10 of the expression ratio for each ORF derived from the FK506 treatment
hybridizations is plotted versus the log10 of the expression ratio in the
calcineurin mutant hybridizations. ORFs that were induced or repressed in
both experiments are shown as green and red dots, respectively. e, The log10
of the expression ratio for each ORF derived from the FK506 treatment
hybridizations is plotted versus the log10 of the expression ratio in the
yer071c mutant hybridizations. No ORFs were induced or repressed in both
experiments.

[Panel labels: wt +/- 1 μg/ml FK506; wt vs. calcineurin mutant. Axis labels:
log10 (R/G) calcineurin mutation; log10 (R/G) yer071c mutation.]

To illustrate this, we treated the his3 mutant strain with 3-
AT. The signature pattern of altered gene expression resulting
from treatment of the mutant strain with 3-AT was much less 
complex than that of the 3-AT signature in wild-type cells (Fig. 
4). This is seen simply by examining plots of mean intensity of 
the hybridization signal (which approximately reflects level of 
expression) versus the expression ratio for each ORF (Fig. 4). 
Genes that were expressed at higher or lower levels in 3-AT 
treated cells or in his3 mutant cells are shown as red and green 
dots, respectively. We analyzed the 3-AT signature in wild-type 
(Fig. 4a) and his3 mutant cells (Fig. 4c), as well as the his3 mu- 
tant strain signature (Fig. 4b). Whereas histidine limitation in- 
duced by 3-AT induced more than 1,000 transcription-level 
changes in the wild-type strain, few or no transcript level 
changes were induced by treatment of the his3-deletion strain
with 3-AT. This indicates that with the growth conditions used, 
essentially all of the effects of 3-AT depend on or are mediated 
through the HIS3 gene product. 

Applying this approach to the calcineurin signaling pathway 
showed the specificity of the method. The calcineurin mutant 
strain and strains with deletions in the genes encoding the 
most abundant immunophilins in yeast 12 (CPH1 and FPR1)
were treated with either FK506 or CsA to determine the profiles

Table 1 Signature correlation of expression ratios as a result of FK506
treatment in various mutant strains

                     wild-type     cna            fpr1           cna fpr1      cph1
                     +/- FK506     +/- FK506      +/- FK506      +/- FK506     +/- FK506
wild-type +/- FK506  0.93 ± 0.04   -0.01 ± 0.07   -0.23 ± 0.07   0.12 ± 0.07   0.79 ± 0.03

[...] signature specifically in the calcineurin (cna) and fpr1
(major FK506 binding protein) deletion mutants. cna represents the mutant with deletions of the catalytic sub-
units of calcineurin, CNA1 and CNA2. The correlation coefficient reported in the first column represents the cor-
relation between two pairs of hybridizations from independent wild-type +/- FK506 experiments.



of altered gene expression resulting from drug treatment of the
mutant cells (that is, mutant +/- drug). We compared the drug
signatures in the mutants to the wild-type drug signature using
the correlation coefficient metric (Table 1). Although the signa-
ture generated by treatment of wild-type cells with FK506 was
highly correlated to the calcineurin mutant strain signature (ρ
= 0.75 ± 0.03), it bore no similarity to the profile after treat-
ment of the calcineurin mutant strain with FK506 (ρ = -0.01 ±
0.07). This indicates that FK506 was unable to elicit its normal
transcriptional response in the calcineurin mutant strain.
Likewise, treatment of the fpr1 mutant strain with FK506
elicited an expression profile that was not correlated to the
FK506 signature in the wild-type strain (ρ = -0.23 ± 0.07), indi-
cating that the FPR1 gene product is likely to be involved in the
pathway affected by FK506. The same was true for the cna fpr1
mutant strain. In contrast, treatment of the cph1 mutant strain
with FK506 generated an expression profile highly correlated
with the wild-type FK506 expression profile (ρ = 0.79 ± 0.03),
indicating that the cph1 mutation did not block the mode of action
of FK506 and thus that CPH1 is not directly involved in the pathway af-
fected by FK506. We tabulated the change in expression in re-
sponse to FK506 in different mutant strains for all ORFs with
expression ratios greater than 1.8 in FK506-treated cells or in
the calcineurin mutant strain (Fig. 5a). The
calcineurin mutant strain signature and the
FK506 responses in wild-type and the cph1
mutant strain are similar, and there are no
transcript-level changes (seen in black) for
treatment of the calcineurin, fpr1 and cna
fpr1 mutant strains with FK506 (Fig. 5a).
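
The 'decoder' step for FK506 can be summarized with the correlations reported in Table 1. The sketch below sorts mutants by whether the wild-type drug signature survives in them. The fixed cutoff of 0.5 is a hypothetical simplification introduced here for illustration; the text itself judges correlations by statistical significance rather than by a fixed threshold, and the function name is likewise invented.

```python
# Correlation of each drug-treated mutant profile with the wild-type
# FK506 signature, as reported in Table 1.
FK506_CORRELATIONS = {
    "cna": -0.01,
    "fpr1": -0.23,
    "cna fpr1": 0.12,
    "cph1": 0.79,
}

def decode(correlations, cutoff=0.5):
    """Split mutants into those in which the drug signature is lost
    (gene likely required for the drug response) and those in which it
    persists. `cutoff` is a hypothetical threshold, not from the paper."""
    required, dispensable = [], []
    for mutant, rho in correlations.items():
        if rho >= cutoff:
            dispensable.append(mutant)
        else:
            required.append(mutant)
    return required, dispensable

required, dispensable = decode(FK506_CORRELATIONS)
print("signature lost in:", required)
print("signature retained in:", dispensable)
```

With these values the calcineurin and fpr1 mutants fall in the first group and cph1 in the second, matching the interpretation given in the text.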

Similar experiments and analyses with CsA
provided further validation of this approach.
The expression profile elicited by treatment
of wild-type cells with CsA was highly corre-
lated to the profile elicited by mutation of the calcineurin genes
(ρ = 0.71 ± 0.04), but did not correlate with the expression pro-
file resulting from treatment of the calcineurin mutant strain
with CsA (ρ = -0.05 ± 0.07; Table 2), indicating that the genetic
deletion of calcineurin interfered with the ability of CsA to
elicit its normal transcriptional response. Likewise, the CsA sig-
nature was essentially absent in CsA-treated cph1 mutant cells,
and the expression profile of CsA-treated cph1 mutant cells cor-
related poorly to that of CsA-treated wild-type cells (ρ = 0.18 ±
0.07). Thus, the CPH1 gene product was required for the CsA re-
sponse seen in wild-type cells. Conversely, treatment of fpr1
mutant cells with CsA resulted in an expression pattern very
similar to the profile of CsA-treated wild-type cells (ρ = 0.77 ±
0.03), indicating that FPR1 was not necessary for the CsA-medi-
ated effects. Analysis of individual ORFs affected by CsA and
their expression ratios over the entire set of experiments con-
firmed that CPH1 and the genes encoding calcineurin, but not
FPR1, are necessary for the wild-type CsA response (Fig. 5b). The
observation that the profiles resulting from FK506 or CsA drug
treatment are similar to that of the calcineurin deletion mutant
strain might allow the prediction that calcineurin was involved
in the pathway affected by these drugs. But because the expres-
sion profile of the fpr1 mutant strain did not bear a strong simi-
larity to the wild-type drug expression profile for FK506, it is
obvious that the drug treatment of the mutant strains was nec-
essary to identify Fpr1, but not Cph1, as a potential FK506 drug
target. In the same way, the 'decoder' strategy was necessary to
identify Cph1, but not Fpr1, as a potential drug target for CsA.

Fig. 3 Expression profiles from a his3 mutant strain and wild-type (wt) cells
treated with 3-AT share a genome-wide correlation. DNA microarray analysis
showing changes in gene expression resulting from 3-AT treatment (a) or from
genetic disruption of the HIS3 gene (c). a, Pseudo-color image of the results
of simultaneous hybridization of Cy5-labeled cDNA (red) from mock-treated
wild-type strain R491 and Cy3-labeled cDNA (green) from strain R491 treated
with 10 mM 3-AT. b, The log10 of the expression ratio for each ORF derived
from the 3-AT treatment hybridizations is plotted versus the log10 of the
expression ratio in the his3 mutant hybridizations. ORFs that were induced or
repressed in both experiments are shown as green and red dots, respectively.
The correlation of expression ratios applies not only to genes with large
expression ratios (for example, CHA1 and ARG1), but also extends to genes
with expression ratios less than 2 (for example, ILV1 and CPH1). ILV1 is
induced 1.9-fold and 1.5-fold, and CPH1 is downregulated 1.9-fold and
1.7-fold, in cells treated with 3-AT and his3 mutant cells, respectively.
Two ORFs do not fall on the line x = y. The leftmost point is the HIS3 data
point, which is induced by 3-AT treatment but which is absent from the his3
mutant strain. The other point is YOR203w. Both data points are labeled HIS3
because hybridization to YOR203w is most likely due to HIS3 mRNA, as YOR203w
overlaps the HIS3 open reading frame. c, Pseudo-color image of the results of
simultaneous hybridization of Cy5-labeled cDNA (red) from wild-type strain
R491 and Cy3-labeled cDNA (green) from strain R1226, deleted for the HIS3
gene. Arrowheads indicate specific ORFs induced or repressed.

'Decoder' approach can identify secondary drug effects 
For a drug that has a single biochemical target, the strategy out- 
lined above may be useful in target validation. In many cases, 
however, a compound may affect multiple pathways and elicit 
a very complex signature. 'Decoding' such a complex signature into the
effects mediated through the intended target (the 'on-target'
signature) and those mediated through unintended targets (the
'off-target' signature) might be useful in evaluating a compound's
specificity. Our 'decoder' strategy is based on the premise that the
'off-target' signature should be insensitive to the genetic disruption
of the primary target.

Fig. 4 Treatment of the his3 mutant strain with 3-AT shows nearly
complete loss of 3-AT signature. A plot of the log10 of the mean
intensity of hybridization for each ORF versus the log10 of its
expression ratio for each experiment is shown next to a pseudo-color
image of a representative portion of the microarray. ORFs that are
induced or repressed at the 95% confidence level are shown in green
and red, respectively. a, Expression profile from treatment of the
wild-type (wt) strain with 3-AT. Cy5-labeled cDNA (red) from
mock-treated strain R491 and Cy3-labeled cDNA (green) from strain R491
treated with 10 mM 3-AT. b, Expression profile from the his3 deletion
strain. Cy5-labeled cDNA (red) from strain R491 and Cy3-labeled cDNA
(green) from strain R1226, deleted for the HIS3 gene. c, Expression
profile of treatment of the his3 deletion strain with 3-AT.
Cy3-labeled cDNA (red) from his3-deleted strain R1226 and Cy5-labeled
cDNA (green) from strain R1226 treated with 10 mM 3-AT. Arrowheads
indicate the DNA probe and data point corresponding to the HIS3 gene.
The blue dashed line represents the threshold below which errors tend
to increase rapidly because spot intensities are not sufficiently
above background intensity.

Signature correlation of expression ratios as a result of CsA
treatment in various mutant strains

                   wild-type     cna            fpr1          cna cph1                cph1
                   +/-CsA        +/-CsA         +/-CsA        +/-CsA                  +/-CsA
wild-type +/-CsA   0.94 ± 0.04   -0.05 ± 0.07   0.77 ± 0.03   -0.[illegible] ± 0.07   0.18 ± 0.07

To determine whether the 'decoder' approach could identify an
'off-target' profile, we looked for a drug-responsive gene whose
expression is insensitive to deletion of the primary target. To
increase the likelihood of observing such genes, the same strains
described in Tables 1 and 2 were treated with higher concentrations
(50 µg/ml) of FK506. This led to a much more complex expression
profile in wild-type cells, indicating that at this higher
concentration, FK506 was inhibiting or activating additional targets.
Several of the ORFs in this expanded FK506-induced expression profile
were not affected by the calcineurin, cph1 or fpr1 mutations, as drug
treatment of these mutant strains did not block their presence in the
FK506 expression signature (Fig. 6). This indicates that FK506 was
triggering changes in transcript levels of many genes through pathways
independent of calcineurin, CPH1 and FPR1. Many of the upregulated
ORFs in the 'off-target' pathway were genes reported to be regulated
by the transcriptional activator Gcn4 (ref. 24). In some strains, a
reporter gene under GCN4 control was induced in response to FK506
treatment. To determine whether GCN4 is involved in this pathway that
is independent of calcineurin, CPH1 and FPR1, we analyzed the effects
of treatment with high-dose FK506 on global gene expression in a
strain with a GCN4 deletion (Fig. 6). Of the 41 ORFs with
calcineurin-independent expression ratios greater than 4, 32 were not
induced in the gcn4 mutant, indicating that their induction by FK506
was GCN4-dependent. Not all GCN4-regulated genes were induced by
FK506. This FK506-induced subset of GCN4-regulated genes may be those
most sensitive to subtle changes in Gcn4 levels, or perhaps other
regulatory circuits prevent FK506 activation of some GCN4-regulated
genes. Seven of the remaining nine ORFs induced by FK506 were
independent of both the calcineurin and GCN4 pathways. The simplest
explanation is that FK506 inhibits or activates additional pathways.
Members of this class include SNQ2 and PDR5, genes that encode drug
efflux pumps with structural homology to mammalian multiple drug
resistance proteins. FK506 may interact directly with Pdr5 to inhibit
its function. Our results indicate that treatment with FK506 leads to
fourfold-to-sixfold induction of PDR5 mRNA levels. YOR1, another gene
that can confer drug resistance, is also induced threefold-to-fourfold
by FK506. Thus, drug treatment of strains with mutations in the
[illegible] prove useful in identifying effects mediated by secondary
drug targets, including the nature and extent of [illegible] and
previously unsuspected pathways.

We describe here a method for drug target validation and the
identification of secondary drug target effects that uses DNA
microarrays to survey the effects of drugs on global gene expression
patterns. We established that genetic and pharmacologic inhibition of
gene function can result in extremely similar changes in gene
expression. We also demonstrated that one can confirm a potential drug
target by treating a deletion mutant defective in the gene encoding
the putative target. Drug-mediated signatures from strains with
mutations in pathways or processes directly or indirectly affected by
the drug bore little or


no similarity to the wild-type drug expression profile. In contrast,
drug-mediated signatures from strains with mutations in genes involved
in pathways unrelated to the drug's action showed extensive similarity
to the wild-type drug signature. By applying this approach to a drug
that affects multiple pathways (FK506), we were able to decode a
complex signature into component parts, including the identification
of an 'off-target' signature that was mediated through pathways
independent of calcineurin or the Fpr1 immunophilin.

[Fig. 5 image: column labels for strain (wt, cna, cph1, fpr1, cna
fpr1) and FK506 treatment]

Fig. 5 Response of FK506 and CsA signature genes in strains with
deletions in different genes. Genes with expression ratios greater
than a factor of 2 in response to treatment with 1 µg/ml FK506 (a) or
50 µg/ml CsA (b) are listed (left side) and their expression ratios in
the indicated strain are shown on the [text missing] and FK506
treatment signature genes are in the first two columns. Almost all
FK506 signature genes have expression ratios near unity in deletion
strains involved in pathways affected by FK506 (calcineurin, fpr1 and
cna fpr1 mutants) but not in deletion strains in unrelated pathways
(cph1). b, Calcineurin (cna) mutant and CsA treatment signature genes
are in the first two columns. Almost all CsA signature genes have
expression ratios near unity in deletion strains involved in pathways
affected by CsA (calcineurin, cph1 and cna cph1 mutants) but not in
deletion strains in unrelated pathways (fpr1).
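
The decoding logic described in this section can be sketched in a few
lines of Python. This is a minimal illustration under stated
assumptions, not the authors' analysis pipeline: the function name
decode_signature, the gene labels and the example ratios are all
hypothetical, and the twofold cutoff is borrowed from the Fig. 5
caption only for convenience. A gene that responds to the drug in
wild-type cells is called target-dependent if its ratio returns to
near unity in the deletion strain, and target-independent
('off-target') if it still responds there.

```python
import math


def decode_signature(wt_ratios, mutant_ratios, fold=2.0):
    """Split a wild-type drug signature by its behavior in a deletion strain.

    wt_ratios / mutant_ratios: dicts mapping gene name -> expression ratio
    (drug-treated versus mock-treated) in wild-type and deletion strains.
    fold: hypothetical cutoff; a gene "responds" if its ratio departs from
    unity by at least this factor in either direction.
    """
    cutoff = math.log10(fold)
    classes = {}
    for gene, ratio in wt_ratios.items():
        if abs(math.log10(ratio)) < cutoff:
            continue  # not part of the wild-type drug signature
        if gene not in mutant_ratios:
            continue  # no measurement in the deletion strain
        still_responds = abs(math.log10(mutant_ratios[gene])) >= cutoff
        classes[gene] = (
            "target-independent" if still_responds else "target-dependent"
        )
    return classes


# Hypothetical ratios, for illustration only.
wt = {"geneA": 5.0, "geneB": 4.5, "geneC": 1.1}
deletion = {"geneA": 1.05, "geneB": 4.0, "geneC": 1.0}
print(decode_signature(wt, deletion))
```

This sketch ignores the direction of the response and the per-gene
error model described in Methods; a fuller treatment would use the 95%
confidence calls rather than a fixed fold cutoff.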

Discussion 

It is well-established that high-throughput biochemical screening can
identify potent inhibitory compounds against a given target. The
'decoder' approach described here complements this process by
evaluating the equally important property of specificity: the tendency
of a compound to inhibit pathways other than that of its intended
target. The ability to observe such 'off-target' effects will likely
be useful in several ways. Profiling compounds with known toxicities
will allow the development of a database of expression changes
associated with particular toxicities. Recognition of potential
toxicities in the 'off-target' signatures of otherwise promising
compounds then may allow earlier identification of those likely to
fail in clinical trials. Comparing the extent and peculiarities of
'off-target' signatures of promising drug candidates could provide a
new way to group compounds by their effects on secondary pathways,
even before those effects are understood. This may prove to be an
alternative, potentially more effective, way to select compounds for
animal and clinical trials. Some drugs are more effective against a
related protein than against the originally intended target.
Sildenafil (Viagra™), for example, was initially developed as a
phosphodiesterase inhibitor to control cardiac contractility, but was
found to be highly specific for phosphodiesterase 5, an isozyme whose
inhibition overcomes defects in penile erection. It is possible that
application of the 'decoder' to other compounds may show that they too
have a potent activity against a target distinct from their intended
target.

[...] color scale. The genes have been divided into classes
corresponding to these expected behaviors: [illegible]-dependent genes
[...] except when GCN4 is deleted. These genes still respond to FK506
when calcineurin genes or FPR1 or CPH1 are deleted; that is, their
re[...] independent genes respond to FK506 in all deletion strains
tested. A 'complex behavior' class is provided for those genes that
did not match the model of FK506 response mediated through calcineurin
or Fpr1 or separately through Gcn4.

The ability to decode drug effects is dependent on the availability of
functionally 'targetless' cells. In yeast this is being achieved by
systematically disrupting each yeast gene (Saccharomyces Deletion
Consortium:
http://sequence-www.stanford.edu/group/yeast_deletion_project/deletion.html).
Efforts are underway to obtain expression profiles from each deletion
mutant strain. Determining signatures resulting from inactivation of
essential genes presents a unique problem, but it may be possible to
do so by examining heterozygotes or by using a controllable promoter
to reduce expression of the essential gene. Although it is already
feasible to test several compounds in dozens of yeast strains, another
challenge for the 'decoder' strategy will be the efficient selection
of the mutants with deletions in genes most likely to encode the
intended drug target. The signature correlation plots described are
one metric that could be used as part of that selection process, but
others need to be explored. Applying the 'decoder' to mammalian cells
presents additional challenges. It is considerably more difficult to
isolate functionally 'targetless' cells. Strategies involving
titratable promoters, known specific inhibitors, anti-sense RNAs,
ribozymes, and methods of targeting specific proteins for degradation
are possible and should be tested. Another limitation is that not all
cell types express the same set of genes and [illegible] effects may
be different in different cell types. In addition, applying the
'decoder' to human cells will also require technical improvements that
allow expression profiling from a small number of cells. Even the
broader question of whether the insensitivity of 'off-target'
signatures to the disruption of the main target is the exception or
the rule can only be answered by the accumulation of more data. Barkai
and Leibler, however, have argued in favor of robustness of biological
networks, indicating that drug perturbations ('off-target' signatures)
may be robust even when the system is subjected to another
perturbation (such as a genetic disruption) (ref. 28). Many practical
developments will be necessary if the 'decoder' concept is to be
broadly applied.

Expression arrays have been used mainly as an initial screen for genes
induced in a particular tissue or process of interest by focusing on
genes with large expression ratios. We have found, however, that
effort to refine experimental protocols and repeat experiments
increases the reliability of the data and permits new applications.
For example, it provides a larger set of genes at higher confidence
levels that serve as a more unique signature for a given protein
perturbation. In addition, it allows subtle signatures to be detected
when, for example, a protein is only partially inhibited. This may
enable clinical monitoring of small changes in protein function in
disease or toxicity states before they could otherwise be detected.
Because the functions of many genes detected on transcript arrays are
known, these microarrays are powerful tools that provide detailed
information about a cell's physiology. For example, changes in the
flux through a metabolic pathway are reflected in transcriptional
changes in genes in the pathway (ref. 7). Furthermore, it may be
possible to indirectly measure protein activity levels from expression
profiling data (S.F. et al., unpublished data). Thus, although the
eventual development of genomic methods allowing the direct
measurement of all cellular protein levels will be an important
achievement, transcript array technology offers an immediate and
robust means of evaluating the effects of various treatments on gene
expression and protein function.

Table 3 Yeast strains used

Strain   Relevant genotype                                                                                   Reference
YPH499   MATa ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1                                           (34)
R563     MATa ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1 his3::HIS3                                (this study)
R558     MATa ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1 fpr1::HIS3                                (this study)
R567     MATa ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1 cph1::HIS3                                (this study)
MCY300   MATa ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1 cna1Δ1::hisG cna2Δ1::HIS3                 (21)
R132     MATa ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1 cna1Δ1::hisG cna2Δ1::HIS3 cph1::kanr      (this study)
R133     MATa ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1 cna1Δ1::hisG cna2Δ1::HIS3 fpr1::kanr      (this study)
R559     MATa ura3-52 lys2-801 ade2-101 trp1-Δ63 his3-Δ200 leu2-Δ1 his3::HIS3 gcn4::LEU2                     (this study)
BY4719   MATa trp1-Δ63 ura3-Δ0                                                                               (35)
BY4738   MATα trp1-Δ63 ura3-Δ0                                                                               (35)
R491     MATa/α BY4719 × BY4738                                                                              (this study)
BY4728   MATa his3-Δ200 trp1-Δ63 ura3-Δ0                                                                     (35)
BY4729   MATα his3-Δ200 trp1-Δ63 ura3-Δ0                                                                     (35)
R1226    MATa/α BY4728 × BY4729                                                                              (this study)

Methods

Construction, growth and drug treatment of yeast strains. The strains
used in this study (Table 3) were constructed by standard techniques.
To construct strain R559, strain R563 was transformed to Leu+ with
plasmid pM12 digested by SalI and MltA (provided by A. Hinnebusch and
T. Dever). Strains R132 and R133 were constructed by transforming the
bacterial kanamycin resistance cassette flanked by genomic DNA from
the CPH1 and FPR1 loci, respectively, and selecting for G418-resistant
colonies. For experiments with FK506, cells were grown for three
generations to a density of 1 × 10^7 cells/ml in YAPD medium (YPD plus
0.004% adenine) supplemented with 10 mM calcium chloride as described
91. Where indicated, FK506 was added to a final concentration of
1 µg/ml 0.5 h after inoculation of the culture or to 50 µg/ml 1 h
before cells were collected. CsA was used at a final concentration of
50 µg/ml. Cells were broken by standard procedures with the following
modifications: cell pellets were resuspended in breaking buffer (0.2 M
Tris-HCl pH 7.6, 0.5 M NaCl, 10 mM EDTA, 1% SDS), vortexed for 2 min
on a VWR multi-tube vortexer at setting 8 in the presence of 60% glass
beads (425-600 µm mesh; Sigma) and phenol:chloroform (50:50,
volume/volume). After separation of the phases, the aqueous phase was
re-extracted and ethanol-precipitated. Poly A+ RNA was isolated by two
sequential chromatographic purifications over oligo dT cellulose (New
England Biolabs, Beverly, Massachusetts) using established protocols.

For experiments using 3-AT, wild-type or his3/his3 cells were grown to
early logarithmic phase in SC medium, pelleted and resuspended in SC
medium lacking histidine for 1 h in the presence or absence of 10 mM
3-AT, as indicated. Cells were harvested and mRNA isolated as above.
FK506 was obtained from the Swedish Hospital Pharmacy (Seattle,
Washington) and purified to homogeneity by ethyl acetate extraction by
J. Simon (Fred Hutchinson Cancer Research Center, Seattle,
Washington). CsA was obtained from Alexis Biochemicals (San Diego,
California); 3-AT was from Sigma.

Preparation and hybridization of the labeled sample. Fluorescently
labeled cDNA was prepared, purified and hybridized essentially as
described (ref. 7). Cy3- or Cy5-dUTP (Amersham) was incorporated into
cDNA during reverse transcription (Superscript II; Life Technologies)
and purified by concentrating to less than 10 µl using Microcon-30
microconcentrators (Amicon, Houston, Texas). Paired cDNAs were
resuspended in 20-26 µl hybridization solution (3× SSC, 0.75 µg/ml
polyA DNA, 0.2% SDS) and applied to the microarray under a 22- ×
30-mm coverslip for 6 h at 63 °C, all according to a published method
1.

Fabrication and scanning of microarrays. PCR products containing
common 5' and 3' sequences (Research Genetics, Huntsville, Alabama)
were used as templates with amino-modified forward primer and
unmodified reverse primers to PCR amplify 6,065 ORFs from the S.
cerevisiae genome. Our first-pass success rate was 94%. Amplification
reactions that gave products of unexpected sizes were excluded from
subsequent analysis. ORFs that could not be amplified from purchased
templates were amplified from genomic DNA. DNA samples from 100-µl
reactions were isopropanol-precipitated, resuspended in water, brought
to a final concentration of 3× SSC in a total volume of 15 µl, and
transferred to 384-well microtiter plates (Genetix Limited,
Christchurch, Dorset, England). PCR products were spotted onto 1 ×
3-inch polylysine-treated glass slides by a robot built essentially
according to defined specifications
(http://cmgm.stanford.edu/pbrown/MGuide). After being printed, slides
were processed according to published protocols.

Microarrays were imaged on a prototype multi-frame CCD camera in
development at Applied Precision (Issaquah, Washington). Each CCD
image frame was approximately 2-mm square. Exposure times of 2 s in
the Cy5 channel (white light through Chroma 618-648 nm excitation
filter, Chroma 657-727 nm emission filter) and 1 s in the Cy3 channel
(Chroma 535-560 nm excitation filter, Chroma 570-620 nm emission
filter) were done consecutively in each frame before moving to the
next, spatially contiguous frame. Color isolation between the Cy3 and
Cy5 channels was about 100:1 or better. Frames were 'knitted' together
in software to make the complete images. The intensities of spots
(about 100 µm) were quantified from the 10-µm pixels by frame-by-frame
background subtraction and intensity averaging in each channel.
Dynamic range of the resulting spot intensities was typically a ratio
of 1,000 between the brightest spots and the background-subtracted
additive error level. Normalization between the channels was
accomplished by normalizing each channel to the mean intensities of
all genes. This procedure is nearly equivalent to normalization
between channels using the intensity ratio of genomic DNA spots, but
is possibly more robust, as it is based on the intensities of several
thousand spots distributed over the array.
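
The mean-intensity normalization described above can be illustrated
with a short Python sketch. It is an assumption-laden illustration,
not the authors' software: the function name normalize_channels and
the example intensities are hypothetical, and the inputs are taken to
be background-subtracted, strictly positive spot intensities listed in
the same spot order for both channels.

```python
def normalize_channels(cy5, cy3):
    """Return per-spot Cy5/Cy3 ratios after mean-intensity normalization.

    Each channel is divided by its own mean over all spots, so that a
    global difference in labeling or exposure between the two channels
    does not appear as a uniform shift in every expression ratio.
    """
    if len(cy5) != len(cy3) or not cy5:
        raise ValueError("channels must be non-empty and of equal length")
    mean5 = sum(cy5) / len(cy5)
    mean3 = sum(cy3) / len(cy3)
    norm5 = [value / mean5 for value in cy5]
    norm3 = [value / mean3 for value in cy3]
    return [a / b for a, b in zip(norm5, norm3)]


# Hypothetical intensities: the Cy5 channel is uniformly twice as bright,
# so every normalized ratio comes out at unity.
print(normalize_channels([200.0, 400.0, 600.0], [100.0, 200.0, 300.0]))
```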

Signature correlation coefficients and their confidence limits.
Correlation coefficients between the signature ORFs of various
experiments were calculated using:

\[
\rho = \frac{\sum_k x_k y_k}{\left( \sum_k x_k^2 \, \sum_k y_k^2 \right)^{1/2}}
\]

where x_k is the log10 of the expression ratio for the kth gene in the
x signature, and y_k is the log10 of the expression ratio for the kth
gene in the y signature. The summation is over those genes that were
either up- or down-regulated in either experiment at the 95%
confidence level. These genes each had a less than 5% chance of being
actually unregulated (having expression ratios departing from unity
due to measurement errors alone). This confidence level was assigned
based on an error model which assigns a lognormal probability
distribution to each gene's expression ratio with characteristic width
based on the observed scatter in its repeated measurements (repeated
arrays at the same nominal experimental conditions) and on the
individual array hybridization quality. This latter dependence was
derived from control experiments in which both Cy3 and Cy5 samples
were derived from the same RNA sample. For large numbers of repeated
measurements the error reduces to the observed scatter. For a single
measurement the error is based on the array quality and the spot
intensity.

Random measurement errors in the x and y signatures tend to bias the 
correlation towards zero. In most experiments, most genes are not signif- 
icantly affected but do show small random measurement errors. Selecting 
only the '95% confidence' genes for the correlation calculation, rather 
than the entire genome, reduces this bias and makes the actual biological 
correlations more apparent. 
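
A minimal Python sketch of the signature correlation follows, assuming
the coefficient is the uncentered sum of products of log10 ratios
divided by the square root of the product of the sums of squares, with
the sums restricted to genes flagged in either experiment. The
function name signature_correlation, the boolean flag lists and the
example values are hypothetical; the 95% confidence calls themselves
would come from the error model described above, which is not
reproduced here.

```python
import math


def signature_correlation(x, y, flagged_x, flagged_y):
    """Uncentered correlation of two log10-ratio signatures.

    x, y: log10 expression ratios, one entry per gene, same gene order.
    flagged_x, flagged_y: booleans marking genes called up- or
    down-regulated at the 95% confidence level in each experiment.
    Only genes flagged in at least one experiment enter the sums.
    """
    pairs = [
        (a, b)
        for a, b, fa, fb in zip(x, y, flagged_x, flagged_y)
        if fa or fb
    ]
    sum_xy = sum(a * b for a, b in pairs)
    sum_xx = sum(a * a for a, _ in pairs)
    sum_yy = sum(b * b for _, b in pairs)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("no signal among the selected genes")
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical signatures: the third gene is unflagged noise and is skipped.
x = [1.0, -0.5, 0.02]
y = [0.9, -0.4, -0.03]
flags = [True, True, False]
print(signature_correlation(x, y, flags, flags))
```

Because the sums are not mean-centered, a profile compared with itself
gives exactly unity, and a profile compared with its sign-reversed
copy gives minus one.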

Correlations between a profile and itself are unity by definition.
Error limits on the correlation are 95% confidence limits based on the
individual measurement error bars, and assuming uncorrelated errors.
They do not include the bias mentioned above; thus, a departure of ρ
from unity does not necessarily mean that the underlying biological
correlation is imperfect. However, a correlation of 0.7 ± 0.1, for
example, is very significantly different from zero. Small (magnitude
of ρ < 0.2) but formally significant correlations in the tables and
text probably are due to small systematic biases in the Cy5/Cy3 ratios
that violate the assumption of independent measurement errors used to
generate the 95% confidence limits. Therefore, these small correlation
values should be treated as not significant. A likely source of
uncorrected systematic bias is the partially corrected scanner
detector nonlinearity that differently affects the Cy3 and Cy5
detection channels.

The 1 µg/ml FK506 treatment signature was compared with more than 40
unrelated deletion mutant strain or drug signatures. These control
profiles had correlation coefficients with the FK506 profile that were
distributed around zero (mean ρ = -0.03) with a standard deviation of
0.16 (data not shown), and none had correlations greater than
ρ = 0.38. Similarly, the calcineurin mutant strain signature
correlated well with the CsA treatment signature (ρ = 0.71 ± 0.04) but
not with the signatures from the negative controls (mean ρ = -0.02
with a standard deviation of 0.18).
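
As a worked piece of arithmetic on the figures quoted above, one can
express a correlation as a number of control standard deviations away
from the control mean. This is only an illustrative reading of the
reported values, not a significance test performed in the text, and
the helper name sds_from_controls is hypothetical.

```python
def sds_from_controls(rho, control_mean, control_sd):
    """Distance of a correlation from the control mean, in control SDs."""
    return (rho - control_mean) / control_sd


# Calcineurin mutant versus CsA treatment, against its negative controls.
print(round(sds_from_controls(0.71, -0.02, 0.18), 2))  # about 4.06

# Largest control correlation reported for the FK506 comparison.
print(round(sds_from_controls(0.38, -0.03, 0.16), 2))  # about 2.56
```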



Quality controls. End-to-end checks on expression ratio measurement
accuracy were provided by analyzing the variance in repeated
hybridizations using the same mRNA labeled with both Cy3 and Cy5, and
also using Cy3 and Cy5 mRNA samples isolated from independent cultures
of the same nominal strain and conditions. Biases undetected with this
procedure, such as gene-specific biases presumably due to differential
incorporation of Cy3- and Cy5-dUTP into cDNA, were minimized by doing
hybridizations in fluor-reversed pairs, in which the Cy3/Cy5 labeling
of the biological conditions was reversed in one experiment with
respect to the other. The expression ratio for each gene is then the
ratio of ratios between the two experiments in the pair. Other biases
are removed by algorithmic numerical de-trending. The magnitude of
these biases in the absence of de-trending and fluor reversal is
typically about 30% in the ratio, but may be as high as twofold for
some ORFs.
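
The cancellation achieved by fluor reversal can be made concrete with
a small Python sketch. It assumes a simple multiplicative model that
is not spelled out in the text: each gene has a dye bias b, so the
forward experiment measures R × b and the reversed experiment measures
b / R, where R is the true treated/control ratio. The ratio of the two
ratios is then R squared, with b cancelled. Taking the square root to
return to a single-experiment scale is an assumption of this sketch;
the text states only that the ratio of ratios is used. The function
name and the numbers are hypothetical.

```python
import math


def combine_fluor_reversed(ratio_forward, ratio_reversed):
    """Combine a fluor-reversed pair of Cy5/Cy3 ratios for one gene.

    ratio_forward: Cy5/Cy3 with the treated sample labeled with Cy5.
    ratio_reversed: Cy5/Cy3 with the treated sample labeled with Cy3.
    Under the multiplicative bias model, the gene-specific dye bias
    cancels in the ratio of ratios.
    """
    return math.sqrt(ratio_forward / ratio_reversed)


# Hypothetical gene: true ratio 4.0, dye bias 1.3 toward Cy5.
true_ratio, bias = 4.0, 1.3
forward = true_ratio * bias    # 5.2
reverse = bias / true_ratio    # 0.325
print(combine_fluor_reversed(forward, reverse))
```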

Expression ratios are based on mean intensities over each spot. Some
smaller spots have fewer image pixels in the average. This does not
degrade accuracy noticeably until the number of pixels falls below
ten, in which case the spot is rejected from the data set. 'Wander' of
spot positions with respect to the nominal grid is adaptively tracked
in array subregions by the image processing software. Unequal spot
'wander' within a subregion greater than half-a-spot spacing is a
difficulty for the automated quantitating algorithms; in this case,
the spot is rejected from analysis based on human inspection of the
'wander'. Any spots partially overlapping are excluded from the data
set. Less than 1% of spots typically are rejected for these reasons.

NC, 1 M NaCl and 50 mM tris-HCl (pH 8.0) at 37°C for 60 min. In the
absence of exogenous RNA, neither cones nor cylinders formed at
concentrations of 0.5 M NaCl or below. Absorption spectra demonstrated
that our CA-NC preparations were not contaminated with Escherichia
coli RNA (estimated lower detection limit was ~1 base/protein
molecule). To control for even lower levels of RNA contamination, we
preincubated the CA-NC protein with 0.5 mg/ml ribonuclease A (Type
1-AS, 54 Kunitz U/mg, Sigma) for 1 hour at 4°C, which then formed
cones normally.

The Transcriptional Program in
the Response of Human
Fibroblasts to Serum

Vishwanath R. Iyer, Michael B. Eisen, Douglas T. Ross,
Greg Schuler, Troy Moore, Jeffrey C. F. Lee, Jeffrey M. Trent,
Louis M. Staudt, James Hudson Jr., Mark S. Boguski,
Deval Lashkari, Dari Shalon, David Botstein, Patrick O. Brown*

The temporal program of gene expression during a model physiological re- 
sponse of human cells, the response of fibroblasts to serum, was explored with 
a complementary DNA microarray representing about 8600 different human 
genes. Genes could be clustered into groups on the basis of their temporal 
patterns of expression in this program. Many features of the transcriptional 
program appeared to be related to the physiology of wound repair, suggesting 
that fibroblasts play a larger and richer role in this complex multicellular 
response than had previously been appreciated. 



The response of mammalian fibroblasts to serum has been used as a
model for studying growth control and cell cycle progression (1).
Normal human fibroblasts require growth factors for proliferation in
culture; these growth factors are usually provided by fetal bovine
serum (FBS). In the absence of growth factors, fibroblasts enter a
nondividing state, termed G0, characterized by low metabolic activity.
Addition of FBS or purified growth factors induces proliferation of
the fibroblasts; the changes in gene expression that accompany this
proliferative response have been the subject of many studies, and the
responses of dozens of genes to serum have been characterized.

We took a fresh look at the response of human fibroblasts to serum,
using cDNA microarrays representing about 8600 distinct human genes to
observe the temporal program of transcription that underlies this
response. Primary cultured fibroblasts from human neonatal foreskin
were induced to enter a quiescent state by serum deprivation for 48
hours and then stimulated by addition of medium containing 10% FBS
(2). DNA microarray hybridization was used to measure the temporal
changes in mRNA levels of 8613 human genes (3) at 12 times, ranging
from 15 min to 24 hours after serum stimulation. The cDNA made from
purified mRNA from each sample was labeled with the fluorescent dye
Cy5 and mixed with a common reference probe consisting of cDNA made
from purified mRNA from the quiescent culture (time zero) labeled with
a second fluorescent dye, Cy3 (4). The color images of the
hybridization results (Fig. 1) were made by representing the Cy3
fluorescent image as green and the Cy5 fluorescent image as red and
merging the two color images.

Fig. 1. The same section of the microarray is shown for three
independent hybridizations comparing RNA isolated at the 8-hour time
point after serum treatment to RNA from serum-deprived cells. Each
microarray contained 9996 elements, including 9804 human cDNAs,
representing 8613 different genes. mRNA from serum-deprived cells was
used to prepare cDNA labeled with Cy3-deoxyuridine triphosphate
(dUTP), and mRNA harvested from cells at different times after serum
stimulation was used to prepare cDNA labeled with Cy5-dUTP. The two
cDNA probes were mixed and simultaneously hybridized to the
microarray. The image of the subsequent [illegible] are more abundant
in the serum-deprived fibroblasts (that is, [illegible]) [illegible]
spots. Yellow spots represent genes whose expression does not vary
substantially between the two [illegible] related protein P5; 2, IL-8
precursor; 3, EST AA057170; and 4, vascular endothelial growth factor

Diverse temporal profiles of gene expres- 
sion could be seen among the 8613 genes sur- 



Rg. 2. Ouster image 
showing the different 
classes of gene expres- 
sion profiles. F'rve hun- 
dred seventeen genes 
whose mRNA levels 
changed in response to 
serum stimulation were 
selected (7). This sub- 
set of genes was clus- 
tered hierarchically into 
groups on the basis of 
the similarity of their 
expression profiles by 
the procedure of Eisen 
ef a/. (£). The expres- 
sion pattern of each 
gene in this set is dis- 
played here as a hori- 
zontal strip. For each 
gene, the ratio of 
mRNA levels in fibro- 
blasts at the indicat- 
ed time after serum 
stimulation ("unsync" 
denotes exponentially 
growing ceils) to its 
level in the serum-de- 
prived (time zero) fi- 
broblasts is represented 
by a color, according to 
the color scale at the 
bottom. The graphs 
show the average ex- 
pression profiles for the 
genes in the corre- 
sponding "duster" (in- 
dicated by the letters A 
to J and color coding). 
In every case examined, 
when a gene was rep- 
resented by more than 
one array element the 
multiple representa- 
tions in this set were 
seen to have identical 
or very similar expres- 
sion profiles, and the 
profiles corresponding 
to these independent 
measurements clus- 
tered either adjacent 
or very dose to each 
other, pointing to the 
robustness of the dus- 
tering algorithm in 
grouping genes with 
very similar patterns of 
expression. 
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veyed in this experiment (Fig. 2); many of these 
genes (about half) were unnamed expressed 
sequence tags (ESTs) (S). Although diverse 
patterns of expression were observed the order- 
ly choreography of the expression program be- 
came apparent when the results were analyzed 
by a clustering and display method developed 
in our laboratory for analyzing genome-wide 
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gene expression data (6). An example of such 
an analysis, here applied to a subset of 5I7 
genes whose expression changed substantially 
in response to serum (7), is shown in Fig. 2. 
The entire detailed data set underlying Fig. 
2 is available as a tab-delimited table (in 
cluster order) at the Science Web site (www. 
sciencemag.cirg, / feature/data/9g4559.shJ). In 
addition, the entire, larger data set for the 
complete set of genes analyzed in this exper- 
iment can be found at a Web site maintained 
by our laboratory (genome-www.stanford. 
edu/serum) (8). 
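
The clustering step described above can be illustrated with a minimal sketch. It is not the published implementation (6): it assumes log2 expression ratios as input, Pearson correlation as the similarity measure, and average linkage for merging clusters. The gene names and values are invented for illustration only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cluster_order(profiles):
    """Average-linkage agglomerative clustering; returns gene names in leaf order."""
    # Each cluster is a list of gene names kept in the order they were merged.
    clusters = [[name] for name in profiles]

    def similarity(c1, c2):
        # Mean pairwise correlation between the members of two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(pearson(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two most similar clusters.
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda ij: similarity(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0]


# Hypothetical log2 ratios at four time points after serum addition.
profiles = {
    "early_1": [2.0, 1.0, 0.2, 0.0],
    "late_1": [0.0, 0.3, 1.5, 2.2],
    "early_2": [1.8, 1.1, 0.1, 0.1],
    "late_2": [0.1, 0.2, 1.4, 2.0],
}
order = cluster_order(profiles)
print(order)
```

In this toy example the two "early" profiles and the two "late" profiles end up adjacent in the leaf order, which is the property the display in Fig. 2 relies on.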

One measure of the reliability of the 
changes we observed is inherent in the ex- 
pression profiles of the genes. For most genes 
whose expression levels changed, we could 
see a gradual change over a few time points, 
which thus effectively provided independent 
measurements for almost all of the observa- 
tions. An additional check was provided by 
the inclusion of duplicate and, in a few cases, 
multiple array elements representing the 
same gene for about 5% of the genes included 
in this microarray. In addition, three indepen- 
dent hybridizations to different microarrays 
with mRNA samples from cells harvested 8 
hours after serum addition showed good correlation
(Fig. 1). As an independent test, we
measured the expression levels of several
genes using the TaqMan 5' nuclease fluorogenic
quantitative polymerase chain reaction
(PCR) assay (9). The expression profiles of
the genes, as measured by these two independent
methods, were very similar (Fig. 3) (10).

The transcriptional response of fibroblasts 
to serum was extremely rapid. The immediate
response to serum stimulation was dominated
by genes that encode transcription factors
and other proteins involved in signal transduction.
The mRNAs for several genes [including
c-FOS, JUN B, and mitogen-activated
protein (MAP) kinase phosphatase-1
(MKP1)] were detectably induced within
15 min after serum stimulation (Fig. 4, A 
and B). Fifteen of the genes that were 
observed to be induced by serum encode 
known or suspected regulators of transcription
(Fig. 4B). All but one were immediate-early
genes; their induction was not inhibited
by cycloheximide (11). This class of
genes could be distinguished into those 
whose induction was transient (Fig. 2, clus- 
ter E) and those whose mRNA levels re- 
mained induced for much longer (Fig. 2, 
clusters I and J). Some features of the 
immediate response appeared to be directed 
at adaptation to the initiating signals. We 
observed a marked induction of mRNA 
encoding MKP1, a dual-specificity phosphatase
that modulates the activity of the
ERK1 and ERK2 MAP kinases (12). The
coincidence of the peak of expression of
genes in cluster E (Fig. 2) with that of
MKP1 (Fig. 4A) suggests the possibility
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that continued activity of the MAP kinase path- 
way is required to maintain induction of these 
genes but not of those with sustained expression 
(clusters I and J). The gene encoding a second 
member of the dual-specificity MAP kinase 
phosphatase family, known as dual-specificity
protein phosphatase 6/pyst2, was induced later, 
at about 4 hours after serum stimulation. Genes 
encoding diverse other proteins with roles in 
signal transduction, ranging from cell-surface 
receptors [for example, the sphingosine 1-phosphate
receptor (EDG-1), the vascular endothelial
growth factor receptor, and the type II
BMP receptor] to regulators of G-protein signaling
(for example, NET1/p115 rho GEF) to
DNA-binding transcription factors, were induced
by serum (Fig. 4A).

The reprogramming of the regulatory cir- 
cuits in response to serum involved not only 
induction of transcription factors but also reduced
expression of many transcriptional regulators,
some of which may play roles in
maintaining the cells in G0 or in priming
them to react to wounding (Fig. 4C). Perhaps
as a consequence of the historical focus on 
genes induced by serum stimulation of fibro- 
blasts, the set of transcription factors whose 
expression diminished upon serum stimula- 
tion has been less well characterized. 

Genes known or likely to be involved in 
controlling and mediating the proliferative response
showed distinctive patterns of regulation.
Several genes whose products inhibit progression
of the cell-division cycle, such as p27
Kip1, p57 Kip2, and p18, were expressed in the
quiescent fibroblasts and down-regulated before
the onset of cell division. The nadir in the
mRNA levels for these genes occurred between
6 and 12 hours after serum stimulation (Fig.
5A), coincident with the passage of the fibroblasts
through G1. The levels of the transcript
encoding the WEE1-like protein kinase, which
is believed to inhibit mitosis by phosphorylation
of Cdc2, diminished between 4 and 8 to
12 hours after serum addition (Fig. 5A), well
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before the onset ofM phase at around 16 hours, 
raising the possibility of an additional role for 
Wee1 in an earlier stage of the cell cycle or in
regulating the G0 to G1 transition. Several
genes induced in the first few hours after serum
stimulation, such as the helix-loop-helix proteins
ID2 and ID3 and EST AA016305, a gene
with homology to G1-S cyclins, are candidates
for roles in promoting the exit from G0.

Genes involved in mediating progression 
through the cell cycle were characterized by a 
distinctive pattern of expression (Fig. 2, clus- 
ter D), reflecting the coincidence of their 
expression with the reentry of the stimulated 
fibroblasts into the cell-division cycle. The 
stimulated fibroblasts replicated their DNA 
about 16 hours after serum treatment. This 
timing was reflected by the induction of 
mRNA encoding both subunits of ribonucle- 
otide reductase and PCNA, the processivity 
factor for DNA polymerase epsilon and delta. 
Cyclin A, Cyclin B1, Cdc2, and CDC28 kinase,
regulators of passage through the S
phase and the transition from G2 to M phase,
were induced at about 16 to 20 hours after
serum addition. The kinase in the Cyclin
B1-CDK pair needs to be activated by phosphorylation.
The gene encoding Cyclin-dependent
kinase 7 (CDK7; a homolog of Xenopus
MO15 cdk-activating kinase) was induced
in parallel with the Cdc2 and Cdc28
kinases (Fig. 5A), suggesting a potential role
for CDK7 in mediating M phase. DNA topoisomerase
II α, required for chromosome segregation
at mitosis; Mad2, a component of
the spindle checkpoint that prevents comple- 
tion of mitosis (anaphase) if chromosomes 
are not attached to the spindle; and the kinet- 
ochore protein CENP-F all showed a similar 
expression profile. 

In the hours after the serum stimulus, one of
the most striking features of the unfolding tran- 
scriptional program was the appearance of nu- 
merous genes with known roles in processes 
relevant to the physiology of wound healing. 



These included both genes involved in the direct
role played by fibroblasts in remodeling of
the clot and the extracellular matrix and, more
notably, genes encoding proteins involved in
intercellular signaling (Fig. 5). Genes induced
in this program encode products that can (i)
participate in the dynamic process of clotting,
clot dissolution, and remodeling and perhaps
contribute to hemostasis by promoting local
vasoconstriction (for example, endothelin-1);
(ii) promote chemotaxis and activation of neutrophils
(for example, COX2) and recruitment
and extravasation of monocytes and macrophages
(for example, MCP1); (iii) promote
chemotaxis and activation of T lymphocytes 
[for example, interleukin-8 (IL-8)] and B 
lymphocytes (for example, ICAM-1), thus 
providing both innate and antigen-specific 
defenses against wound infection and recruit- 
ing the phagocytic cells that will be required 
to clear out the debris during remodeling of 
the wound; (iv) promote angiogenesis and 
neovascularization (for example, VEGF) 
through newly forming tissue; (v) promote 
migration and proliferation of fibroblasts (for 
example, CTGF) and their differentiation into
myofibroblasts (for example, Vimentin); and 
(vi) promote migration and proliferation of 
keratinocytes, leading to reepithelialization 
of the wound (for example, FGF7), and pro- 
mote proliferation of melanocytes, perhaps 
contributing to wound hyperpigmentation 
(for example, FGF2). 

Coordinated regulation of groups of genes 
whose products act at different steps in a 
common process was a recurring theme. For 
example, Furin, a prohormone-processing 
protease required for one of the processing 
steps in the generation of active endothelin 
was induced in parallel with induction of the 
gene encoding the precursor of endothelin-1
(Fig. 5E) (13). Conversely, expression of
CALLA/CD10, a membrane metalloprotease
that degrades endothelin-1 and other peptide
mediators of acute inflammation, was re- 
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duced A second example is provided by a set 
of five genes involved in the biosynthesis of 
cholesterol (Fig. 5I). The mRNAs encoding
each of these enzymes showed sharply dimin- 
ished expression beginning 4 to 6 hours after 
serum stimulation of fibroblasts. A likely ex- 
planation for the coordinated down-regula- 
tion of the cholesterol biosynthetic pathway 
is that serum provides cholesterol to fibro- 
blasts through low-density lipoproteins, 
whereas in the absence of the cholesterol 
provided by serum, endogenous cholesterol 
biosynthesis in fibroblasts is required. 

Many of the previously studied genes that 
we observed to be regulated in this program 
have no recognized role in any aspect of wound 
healing or fibroblast proliferation. Their identi- 
fication in this study may therefore point to 
previously unknown aspects of these processes. 
A few selected genes in this group are shown in 
Fig. 5H. The stanniocalcin gene, for example 
(Fig. 5H), encodes a secreted protein without a 
clearly identified function in human cells (14, 
15). Its induction in serum-stimulated fibro- 
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Fig. 4. "Reprogramming" of fibroblasts. Expres- 
sion profiles of genes whose function is likely to 
play a role in the reprogramming phase of the 
response are shown with the same representa- 
tion as in Fig. 2. In the cases in which a gene 
was represented by more than one element in 
the microarray, all measurements are shown. 
The genes were grouped into categories on the 
basis of our knowledge of their most likely role. 
Some genes with pleiotropic roles were includ- 
ed in more than one category. 




blasts suggests the possibility that it may play a 
role in the wound-healing process, perhaps 
serving as a signal in mediating inflammation 
or angiogenesis. 

One of the most important results of this 
exploration was the discovery of over 200 pre- 
viously unknown genes whose expression was 
regulated in specific temporal patterns during 
the response of fibroblasts to serum. For example,
13 of the 40 genes in cluster D (Fig. 2) have
descriptive names that reflect their putative 
function. Nine of these 13 genes (69%) encode 
proteins that play roles in cell cycle progression,
particularly in DNA replication and the
G2-M transition. This enrichment for cell
cycle-related genes suggests that some of the 



unnamed genes in this cluster (for example,
EST W79311 and EST R13146, neither of
which has sequence similarity to previously
characterized genes) may represent previously
unknown genes involved in this part of the cell
cycle. Similarly, a remarkable fraction of genes
that were grouped into cluster F on the basis of
their expression profiles encoded proteins involved
in intercellular signaling (Fig. 2), suggesting
that a similar role should be considered
for the many unnamed genes in this cluster. A 
disproportionately large fraction of the genes 
whose transcription diminished upon serum 
stimulation were unnamed ESTs. 

Our intention was to use this experiment as 
a model to study the control of the transition 




Fig. 5. The transcriptional response to serum suggests a multifaceted role for fibroblasts in the
physiology of wound healing. The features of the transcriptional program of fibroblasts in response
to serum stimulation that appear to be related to various aspects of the wound-healing process and
fibroblast proliferation are shown with the same convention for representing changes
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from Go to a proliferating state. However, one 
of the defining characteristics of genome-scale 
expression profiling experiments is that the ex- 
amination of so many diverse genes opens a 
window on all the processes that actually occur 
and not merely the single process one intended 
to observe. Serum, the soluble fraction of clot- 
ted blood, is normally encountered by cells in 
vivo in the context of a wound. Indeed, the
expression program that we observed in re- 
sponse to serum suggests that fibroblasts are 
programmed to interpret the abrupt exposure to 
serum not as a general mitogenic stimulus but 
as a specific physiological signal, signifying a 
wound. The proliferative response that we originally
intended to study appeared to be part of a
larger physiological response of fibroblasts to a
wound. Other features of the transcriptional
response to serum suggest that the fibroblast is 
an active participant in a conversation among 
the diverse cells that work together in wound 
repair, interpreting, amplifying, modifying, and
broadcasting signals controlling inflammation,
angiogenesis, and epithelial regrowth during
the response to an injury.

We recognize that these in vitro results 
almost certainly represent a distorted and incomplete
rendering of the normal physiological
response of a fibroblast to a wound.
Moreover, only the responses elicited directly 
by exposure of fibroblasts to serum were 
examined. The subsequent signals from other 
cellular participants in the normal wound- 
healing process would certainly provoke fur- 
ther evolution of the transcriptional program 
in fibroblasts at the site of a wound, which 
this experiment cannot reveal. Nevertheless,
we believe that the picture that emerged 
strongly suggests a much larger and richer 
role for the fibroblast in the orchestration of 
this important physiological process than had 
previously been suspected.
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Systematic variation in gene expression 
patterns in human cancer cell lines 
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We u 5 ed cDNA microarrays to explore the variation in expression of approximately 8,000 unique genes among the 
60 cell lines used in the National Cancer Institute's screen for anti-cancer drugs. Classification of the cell lines based 
solely on the observed patterns of gene expression revealed a correspondence to the ostensible origins of the 
tumours from which the cell lines were derived. The consistent relationship between the gene expression patterns 
and the tissue of origin allowed us to recognize outliers whose previous classification appeared incorrect. Specific
features of the gene expression patterns appeared to be related to physiological properties of the cell lines such 
as their doubling time in culture, drug metabolism or the interferon response. Comparison of gene expression pat- 
terns in the cell lines to those observed in normal breast tissue or in breast tumour specimens revealed features of 
the expression patterns in the tumours that had recognizable counterparts in specific cell lines, reflecting the 
tumour, stromal and inflammatory components of the tumour tissue. These results provided a novel molecular 
characterization of this important group of human cell lines and their relationships to tumours in vivo.



Introduction 

Cell lines derived from human tumours have been extensively used 
as experimental models of neoplastic disease. Although such cell 
lines differ from both normal and cancerous tissue, the inaccessi- 
bility of human tumours and normal tissue makes it likely that 
such cell lines will continue to be used as experimental models for 
the foreseeable future. The National Cancer Institute's Develop- 
mental Therapeutics Program (DTP) has carried out intensive 
studies of 60 cancer cell lines (the NCI60) derived from tumours
from a variety of tissues and organs M . The DTP has assessed many 
molecular features of the cells related to cancer and chemotherapeutic
sensitivity, and has measured the sensitivities of these 60 cell
lines to more than 70,000 different chemical compounds, includ- 
ing all common chemotherapeutics (http://dtp.nci.nih.gov). A 
previous analysis of these data revealed a connection between the 
pattern of activity of a drug and its method of action. In particular, 
there was a tendency for groups of drugs with similar patterns of 
activity to have related methods of action3,5-7.

We used DNA microarrays to survey the variation in abun- 
dance of approximately 8,000 distinct human transcripts in these 
60 cell lines. Because of the logical connection between the func- 
tion of a gene and its pattern of expression, the correlation of gene 
expression patterns with the variation in the phenotype of the cell
can begin the process by which the function of a gene can be 
inferred. Similarly, the patterns of expression of known genes can 



reveal novel phenotypic aspects of the cells and tissues studied8-10.
Here we present an analysis of the observed patterns of gene 
expression and their relationship to phenotypic properties of the
60 cell lines. The accompanying report 11 explores the relationship 
between the gene expression patterns and the drug sensitivity profiles
measured by the DTP. The assessment of gene expression patterns
in a multitude of cell and tissue types, such as the diverse set
of cell lines we studied here, under diverse conditions in vitro and
in vivo, should lead to increasingly detailed maps of the human
gene expression program and provide clues as to the physiological 
roles of uncharacterized genes 11 " 16 . The databases, plus tools for 
analysis and visualization of the data, are available (http://genome-www.stanford.edu/nci60
and http://discover.nci.nih.gov).

Results 

We studied gene expression in the 60 cell lines using DNA 
microarrays prepared by robotically spotting 9,703 human 
cDNAs on glass microscope slides 17 - 18 . The cDNAs included 
approximately 8,000 different genes: approximately 3,700 repre- 
sented previously characterized human proteins, an additional 
1,900 had homologues in other organisms and the remaining 
2,400 were identified only by ESTs. Due to ambiguity of the identity
of the cDNA clones used in these studies, we estimated that
approximately 80% of the genes in these experiments were cor- 
rectly identified. The identities of approximately 3,000 cDNAs 



Laboratory of Molecular Pharmacology, Division of Basic Sciences, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, USA. Incyte Pharmaceuticals, Fremont, California, USA. Genometrix Inc., The Woodlands, Texas, USA.



nature genetics • volume 24 • march 2000 









© 2000 Nature America Inc. • http://genetics.nature.com




Fig. 1 Gene expression patterns related to the tissue of origin of the cell lines.
Two-dimensional hierarchical clustering was applied to expression data from a set of
1,161 cDNAs measured across 64 cell lines. a, Cell-line dendrogram, with branches
labelled CNS, renal, ovarian, leukaemia, colon and melanoma. Note that the two triplets
of repeated samples clustered tightly together and were well differentiated from the
other cell lines, indicating that the clustering of cell lines is based on variation in
their gene expression patterns rather than artefacts of the experimental procedures.
b, A coloured representation of the data table with the rows and columns in cluster
order. The colour of each cell of this table reflects the expression level of a gene
(row) in a cell line (column). The colour scale used is shown. The labels 3a-3d mark
the gene clusters shown in Fig. 3.

from these experiments have been sequence-verified, including
all of those referred to here by name.

Each hybridization compared Cy5-labelled cDNA reverse transcribed
from mRNA isolated from one of the cell lines with Cy3-labelled
cDNA reverse transcribed from a reference mRNA
sample. This reference sample, used in all hybridizations, was
prepared by combining an equal mixture of mRNA from 12 of
the cell lines (chosen to maximize diversity in gene expression as
determined primarily from two-dimensional gel studies3). By
comparing cDNA from each cell line with a common reference,
variation in gene expression across the 60 cell lines could be
inferred from the observed variation in the normalized Cy5/Cy3
ratios across the hybridizations.
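
How ratios against a common reference make arrays comparable can be sketched as follows. The normalization actually used in the study is not described in this passage; the example below assumes, purely for illustration, that each array is normalized by median-centring its log2(Cy5/Cy3) values. The function name and all intensities are hypothetical.

```python
from math import log2
from statistics import median


def normalized_log_ratios(cy5, cy3):
    """Return log2(Cy5/Cy3) per spot, shifted so the array median is zero.

    cy5: background-subtracted sample-channel intensities (one per spot).
    cy3: background-subtracted reference-channel intensities (same order).
    Median-centring is an assumed normalization, not the published one.
    """
    raw = [log2(sample / reference) for sample, reference in zip(cy5, cy3)]
    shift = median(raw)
    return [value - shift for value in raw]


# Three hypothetical spots from one hybridization.
result = normalized_log_ratios([200, 800, 400], [100, 100, 100])
print(result)  # [-1.0, 1.0, 0.0]
```

Because every cell line is measured against the same reference sample, the normalized log ratios for a given gene can be compared directly from one hybridization to the next.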

To assess the contribution of artefactual sources of variation in 
the experimentally measured expression patterns, K562 and
MCF7 cell lines were each grown in three independent cultures,
and the entire process was carried out independently on mRNA
extracted from each culture. The variance in the triplicate fluo- 
rescence ratio measurements approached a minimum when the 
fluorescence signal was greater than approximately 0.4% of the 
measurable total signal dynamic range above background in 
either channel of the hybridization. We selected the subset of 
spots for which significant signal was present in both the numer- 
ator and denominator of the ratios by this criterion to identify 
the best-measured spots. The pair-wise correlation coefficients 
for the triplicates of the set of genes that passed this quality control
level (6,992 spots included for the MCF7 samples and 6,161
spots for K562) ranged from 0.83 to 0.92 (for graphs and details,
see http://genome-www.stanford.edu/nci60).
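
The quality filter and the triplicate comparison described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the study: it assumes a 16-bit scanner (dynamic range 65,535), background-subtracted intensities, and correlations computed on log2 ratios. The function names and the data are hypothetical.

```python
from itertools import combinations
from math import log2, sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length, non-constant lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def well_measured(spot, dynamic_range, fraction=0.004):
    """True if both channels exceed the cutoff in every replicate of a spot."""
    cutoff = fraction * dynamic_range  # 0.4% of the dynamic range
    return all(cy5 > cutoff and cy3 > cutoff for cy5, cy3 in spot)


def replicate_correlations(spots, dynamic_range):
    """Pairwise correlations of log2 ratios between replicate hybridizations."""
    kept = [s for s in spots if well_measured(s, dynamic_range)]
    n_reps = len(kept[0])
    log_ratios = [[log2(s[r][0] / s[r][1]) for s in kept] for r in range(n_reps)]
    return {(a, b): pearson(log_ratios[a], log_ratios[b])
            for a, b in combinations(range(n_reps), 2)}


# Each spot holds (Cy5, Cy3) intensities from three replicate hybridizations.
spots = [
    [(4000, 1000), (4200, 1000), (3900, 1000)],
    [(500, 2000), (520, 2000), (480, 2000)],
    [(1500, 1500), (1600, 1500), (1450, 1500)],
    [(3000, 600), (2800, 600), (3100, 600)],
    [(100, 90), (300, 80), (50, 120)],  # dim spot, removed by the filter
]
correlations = replicate_correlations(spots, dynamic_range=65535)
print(correlations)
```

Filtering before computing the correlations matters because ratios formed from near-background intensities are dominated by noise and would lower the apparent reproducibility.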

To make the orderly features in the data more apparent, we used 
a hierarchical clustering algorithm and a pseudo-colour visualization
matrix. The object of the clustering was to group cell
lines with similar repertoires of expressed genes and to group
genes whose expression level varied among the 60 cell lines in a
similar manner. Clustering was performed twice using different
subsets of genes to assess the robustness of the analysis. In one case
(Fig. 1), we concentrated on those genes that showed the most
variation in expression among the 60 cell lines (1,161 total). A second
analysis (Fig. 2) included all spots that were thought to be well
measured in the reference set (6,831 spots).
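
One simple way to select "genes that showed the most variation" is to require a minimum fold change in a minimum number of cell lines. The sketch below illustrates that idea only; the thresholds are placeholder parameters, not the values used to build Fig. 1, and the gene names and log2 ratios are invented.

```python
def most_variable(log_ratios, min_abs_log2=2.0, min_lines=4):
    """Keep genes whose |log2 ratio| reaches min_abs_log2 in at least min_lines cell lines.

    log_ratios maps a gene name to its log2 ratios across cell lines.
    Both thresholds are hypothetical defaults chosen for this example.
    """
    return [gene for gene, values in log_ratios.items()
            if sum(abs(v) >= min_abs_log2 for v in values) >= min_lines]


# Hypothetical log2 ratios for three genes across six cell lines.
log_ratios = {
    "gene_a": [3.0, -2.5, 2.2, 0.1, -3.1, 0.0],
    "gene_b": [0.2, -0.3, 0.1, 0.4, 2.5, 0.0],
    "gene_c": [2.1, 2.0, 2.4, 2.9, 0.3, -0.2],
}
selected = most_variable(log_ratios)
print(selected)  # ['gene_a', 'gene_c']
```

Requiring the change in several cell lines, rather than in a single one, reduces the chance that one aberrant measurement brings a gene into the analysis.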
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Gene expression patterns related to the histologic 
origins of the cell lines

The most notable property of the clustered data was that cell lines 
with common presumptive tissues of origin grouped together
(Figs 1 and 2). Cell lines derived from leukaemia, melanoma,
central nervous system, colon, renal and ovarian tissue were clustered
into independent terminal branches specific to their respective
organ types with few exceptions. Cell lines derived from
non-small-cell lung carcinoma and breast tumours were distributed
in multiple different terminal branches, suggesting that their gene
expression patterns were more heterogeneous. 
Many of these coherent cell line clusters were distinguished by 

the specific expression of characteristic groups of genes
(Fig. 3a-d). For example, a cluster of approximately 90 genes was
highly expressed in the melanoma-derived lines (Fig. 3c). This set
was enriched for genes with known roles in melanocyte biology,
including tyrosinase and dopachrome tautomerase (TYR and
DCT; two subunits of an enzyme complex involved in melanin
synthesis), MART1 (MLANA; which is being investigated as a
target for immunotherapy of melanoma) and S100-β (S100B;
which has been used as an antigenic marker in the diagnosis of
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«©. 2 Gene expression patterns related to 
other cell-fine phenotypes, a. We applied 
two-dimensional hierarchical clustering to 
expression data from a set of 6,831 cDNAs
measured across the 64 cell lines. The 6,831 
cDNAs were those with a minimum fluores- 
cence signal intensity of approximately 0.4% 
of the dynamic range above background in 
the reference channel in each of the six 
hybridizations used to establish reproducibil- 
ity. This effectively selected those spots that 
provided the most reliable ratio measure- 
ments and therefore identified a subset of 
genes useful for exploring patterns comprised 
of those whose variation in expression across 
the 60 cell lines was of moderate magnitude. 
b, Cluster-ordered data table, c, Doubling 
time of cell lines. Cell lines are given in cluster
order. Values are plotted relative to the mean. 
Doubling times greater than the mean are 
shown in green, those with doubling time less 
than the mean are shown in red. d, Three
related gene clusters that were enriched for 
genes whose expression level variation was 
correlated with cell line proliferation rate. 
Each of the three gene clusters (clustered 
solely on the basis of their expression pat- 
terns) showed enrichment for sets of genes 
involved in distinct functional categories (for 
example, ribosomal genes versus genes 
involved in pre-RNA splicing). e, Gene cluster
in which all characterized and sequence-verified
cDNAs encode genes known to be regulated
by interferons. f, Gene cluster enriched
for genes that have been implicated in drug 
metabolism (indicated by asterisks). A further 
property of the gene clustering evident here 
and in Fig. 2 is the strong tendency for redun- 
dant representations of the same gene to 
cluster immediately adjacent to one another, 
even within larger groups of genes with very 
similar expression patterns. In addition to 
illustrating the reproducibility and consis- 
tency of the measurements, and providing 
independent confirmation of many of our 
measurements, this property also demon- 
strates that these, and probably all, genes 
have nearly unique patterns of variation 
across the 60 cell lines. If this were not the 
case, and multiple genes had identical pat- 
terns of variation, we would not expect to be 
able to distinguish, by clustering on the basis 
of expression variation, duplicate copies of 
individual genes from the other genes with 
identical expression patterns. 
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in |he NC160. did not show this characteristic partem. Although 
isolated from a patient with melanoma, LOXIMVI has previously
been noted to lack melanin and other markers useful for identification
of melanoma cells.
Paradoxically, two related cell lines (MDA-MB435 and MDA-N),
which were derived from a single patient with breast cancer
and have been conventionally regarded as breast cancer cell lines,
shared expression of the genes associated with melanoma. MDA-MB435
was isolated from a pleural effusion in a patient with
metastatic ductal adenocarcinoma of the breast24,25. It remains
possible that the origin of the cell line was a breast cancer, and that
its gene expression pattern is related to the neuroendocrine features
of some breast cancers26. But our results suggest that this cell
line may have originated from a melanoma, raising the possibility
that the patient had a co-existing occult melanoma.

The higher-level organization of the cell-line tree, in which
groups span cell lines from different tissue types, also reflected
shared biological properties of the tissues from which the cell
lines were derived. The carcinoma-derived cell lines were divided
into major branches that separated those that expressed genes
characteristic of epithelial cells from those that expressed genes
more typical of stromal cells. A cluster of genes is shown (Fig. 3b)
that is most strongly expressed in cell lines derived from colon 
carcinomas, six of seven ovarian-derived cell lines and the two 
breast cancer lines positive for the oestrogen receptor. The named
genes in this cluster have roles in epithelial cell biology. The
cluster was enriched for genes whose products are known to localize
to the basolateral membrane of epithelial cells, including those
encoding components of adherens complexes (for example,
desmoplakin (DSP), periplakin (PPL) and plakoglobin (JUP)), an
epithelial-expressed cell-cell adhesion molecule (M4S1) and a
sodium/hydrogen ion exchanger (SLC9A1). It also contained genes
that encode putative transcriptional regulators of epithelial
morphogenesis, a human homologue of a Drosophila melanogaster
epithelial-expressed tumour suppressor (LLGL1) and a homeobox
gene thought to control calcium-mediated adherence in
epithelial cells (MSX2).

In contrast, a separate, major branch of the cell-line dendrogram
(Fig. 1a) included all glioblastoma-derived cell lines, all
renal-cell-carcinoma-derived cell lines and the remaining
carcinoma-derived lines. The characteristic set of genes expressed in
this cluster included many whose products are involved in stromal
cell functions (Fig. 3d). Indeed, the two cell lines originally
described as sarcoma-like in appearance (Hs578T, breast
carcinosarcoma, and SF539, gliosarcoma) expressed most of these
genes. Although no single gene was uniformly characteristic
of this cluster, each cell line showed a distinctive pattern of
expression of genes encoding proteins with roles in synthesis or
modification of the extracellular matrix (for example, caldesmon,
cathepsins, thrombospondin (THBS), lysyl oxidase
and collagen subtypes). Although the ovarian and most
non-small-cell-lung-derived carcinomas expressed genes characteristic
of both epithelial cells and stromal cells, they
clustered with the CNS and renal cell carcinomas in this analysis
because genes characteristically expressed in stromal cells were
more abundantly represented in this gene set.



res^'' *!• ? riation m «P«s«on levels may reflect cor- 

SEE,? 7 10 P^tmg cells 

Within this large cluster were smaller clusters enriched for genes
with more specialized roles. One cluster was highly enriched for
numerous ribosomal genes, whereas another was more enriched
for genes encoding RNA-splicing factors. The variation in
expression of these ribosomal genes was significantly correlated
with variation in the cell doubling time, supporting the notion that
the genes in this cluster were

In a smaller gene cluster (Fig. 2d), all of the named genes were
previously known to be regulated by interferons. Additional
groups of interferon-regulated genes showed distinct patterns of


Physiological variation reflected 
in gene expression patterns 

A cluster diagram of 6,831 genes (Fig. 2) is useful for exploring
clusters of genes whose variation in mRNA levels was not obviously
attributable to cell or tissue type. We identified some gene
clusters that were enriched for genes involved in specific cellular 
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Cell lines facilitate interpretation of gene exDre»i»n 
patterns in complex clinical samples

Like many other types of cancer, tumours of the breast typically
have a complex histological organization, with conSS 

exlr?^ £r ,ra, r imerWVen tumour S £ 
explore the possibility that variation in gene expression in the
tumour cell lines might provide a framework <bSSSS £ 

isolated from two breast cancer biopsy samples, a sample of normal
breast tissue and the NCI60 cell lines derived from breast
cancers (excluding MDA-MB-435 and MDA-N) and leukaemias
(Fig. 4). This clustering highlighted features of the gene expression

cell lines derived from breast cancers and leukaemias

The genes encoding keratin 8 (KRT8) and keratin 19 (KRT19)

Te NCiaTce'll 0 ? ^ CPitheliar B ™ S • 

«m i ? a i ' 'J"*"' Were ^ m both ofthe biopsy 
samples and the two breast-derived cell lines, MCF-7 and T47D, 
expressing the oestrogen receptor, suggesting that the tran- 
scripts originated in tumour cells with features similar to those of 
luminal epithelial cells (Fig. 5a). Expression of a set of genes char- 

(TAcfL ? A1) 3nd Sm °° th musde «U rnarkers 

J^k™:> {eatUre Sh3red by the tumour "-"pie andThe 
stromal-like cell lines Hs578T and BT549 (Fig. 5b). This feature 
of the expression pattern seen in the tumour sample is likely to 

also sha ed ^ T^"™ ° f ,he ,UmOUr Tbe tumours 
also shared expression of a set of genes (Fig. 5c) with the multiple 
myeloma cell line (RPMI-8226), notably including 
immunoglobulin genes, consistent with the presence of B cells 
in the tumour (this was confirmed by staining with anti- 
Fig. 3 Gene clusters related to tissue characteristics in the cell lines. Enlargements of the regions of the cluster diagram showing gene clusters enriched for genes expressed in cell lines of ostensibly similar origins. Names are shown only for known genes whose identities were verified by sequencing. The position of names in the adjacent list only approximates their position in the cluster diagram. 
Fig. 4 Gene expression in breast cancer specimens and cultured breast cancer and leukaemia cell lines. a, Hierarchical clustering applied to gene expression data from breast tissue specimens, a lymph node metastasis, and the NCI60 breast and leukaemia-derived cell lines. Data from tissue specimens were clustered along with data from a subset of the NCI60 cell lines to explore whether features of expression patterns observed in specific lines could be identified in the specimens. Gene clusters that may represent cellular components of the tumour specimen are indicated (shown in detail in Fig. 5). b, Specimen stained with anti-keratin antibodies, showing the cellular organization characteristically found in breast tumours. The arrows indicate cellular components of this tissue that can be distinguished by the gene expression cluster analysis (Fig. 5). 



immunoglobulin antibodies; data not shown). Therefore, dis- 
tinct sets of genes with co-varying expression among the samples 
(Fig. 4, arrow) appear to represent distinct cell types that can be 
distinguished in breast cancer tissue. A fourth cluster of genes, 
more highly expressed in all of the cell lines than in any of the 
clinical specimens, was enriched for genes present in the 'prolif- 
eration' cluster described above (Fig. 5d). The variation in 
expression of these genes likely paralleled the difference in prolif- 
eration rate between the rapidly cycling cultured cell lines and the 
much more slowly dividing cells in tissues. 

Discussion 

Newly available genomics tools allowed us to explore variation in 
gene expression on a genomic scale in 60 cell lines derived from 
diverse tumour tissues. We used a simple cluster analysis to iden- 
tify the prominent features in the gene expression patterns that 
appeared to reflect 'molecular signatures' of the tissue from 
which the cells originated. The histological characteristics of the 
cell lines that dominated the clustering were pervasive enough 
that similar relationships were revealed when alternative subsets 
of genes were selected for analysis. Additional features of the 
expression pattern may be related to variation in physiological 
attributes such as proliferation rate and activity of interferon- 
response pathways. 

The properties of the tumour-derived cell lines in this study 
have presumably all been shaped by selection for resistance to 
host defences and chemotherapeutics and for rapid proliferation 
in the tissue culture environment of synthetic growth media, fetal 
bovine serum and a polystyrene substratum. But the primary 
identifiable factor accounting for variation in gene expression 
patterns among these 60 cell lines was the identity of the tissue 
from which each cell line was ostensibly derived. For most of the 
cell lines we examined, neither physiological nor experimental 
adaptation for growth in culture was sufficient to overwrite the 
gene expression programs established during differentiation in 
vivo. Nevertheless, the prominence of mesenchymal features in 
the cell lines isolated from glioblastomas and carcinomas may 
reflect a selection for the relative ease of establishment of cell 
lines expressing stromal characteristics, perhaps combined with 
physiological adaptation to tissue culture conditions M_40 . 



Biological themes linking genes with related expression pat- 
terns may be inferred in many cases from the shared attributes of 
known genes within the clusters. Uncharacterized cDNAs are 
likely to encode proteins that have roles similar to those of the 
known gene products with which they appear to be co-regulated. 
Still, for several clusters of genes, we were unable to discern a com- 
mon theme linking the identified members of the cluster. Further 
exploration of their variation in expression under more diverse 
conditions and more comprehensive investigation of the physiol- 
ogy of the NCI60 cells may provide insight. The relationship of 
the gene expression patterns to the drug sensitivity patterns mea- 
sured by the DTP is an example of linking variation in gene 
expression with more subtle and diverse phenotypic variation. 

The patterns of gene expression measured in the NCI60 cell 
lines provide a framework that helps to distinguish the cells that 
express specific sets of genes in the histologically complex breast 
cancer specimens. Although it is now feasible to analyse gene 
expression in micro-dissected tumour specimens 42,43, this obser- 
vation suggests that it will be possible to explore and interpret 
some of the biology of clinical tumour samples by sampling them 
intact. As in conventional morphological pathology, one 
might be able to observe interactions between a tumour and its 
microenvironment in this way. These relationships will be clari- 
fied by suitable analysis of gene expression patterns from intact as 
well as dissected tumours 12,14,15,41. 

Methods 

cDNA clones. We obtained the 9,703 human cDNA clones 
used in these experiments as bacterial colonies in 96-well microtitre 
plates. Approximately 8,000 distinct Unigene clusters (representing nominally 
unique genes) were represented in this set of clones. All genes identi- 
fied here by name represent clones whose identities were confirmed by re- 
sequencing, or by the criteria that two or more independent cDNA clones 
representing the same gene had identical gene expression 
patterns. Sequence verification was attempted for every 
clone after re-streaking for single colonies. For a subset of genes for which 
quality 3' sequence was not obtained, we attempted to confirm identities by 
5' sequencing. Of the subset of clones selected for 5' sequence verification 
on the basis of an interesting pattern of expression (888 total), 331 were cor- 
rectly identified, 57 incorrectly identified, and 500 indeterminate (poor 
quality sequence). 
Approximately 3,000 clones have been verified. The full list of clones used and their nomi- 
nal identities are available (gene names preceded by the designation "SID*" 
(Stanford Identification) represent clones whose identities have not yet been 
verified; http://genome-www.stanford.edu:8000/nci60). 
Production of cDNA microarrays. The arrays used in this experiment were 
produced at Synteni Inc. (now Incyte Pharmaceuticals). Each insert was 
amplified from a bacterial colony by sampling 1 µl of bacterial media and 
performing PCR amplification of the insert using consensus primers for 
the plasmids represented in the clone set (5'-TTGTAAAACGACG 
GCCAGTG-3', 5'-CACACAGGAAACAGCTATG-3'). Each PCR product 
was purified by gel exclusion, concentrated and resuspended in 
3×SSC (10 µl). The PCR products were then printed on treated 
microscope slides using a robot with four printing tips. Detailed protocols 
for assembling and operating a microarray printer, and printing and exper- 
imental application of DNA microarrays are available online. 

Preparation of mRNA and reference pool. Cell lines were grown from NCI 
DTP frozen stocks in RPMI-1640 supplemented with phenol red, glutamine 
(2 mM) and 5% fetal calf serum. To minimize the contribution of variations 
in culture conditions or cell density to differential gene expression, we grew 
each cell line to 80% confluence and isolated mRNA 24 h after transfer to 
fresh medium. The time between removal from the incubator and lysis of the 
cells in RNA stabilization buffer was minimal (<1 min). Cells were lysed 
in buffer containing guanidium isothiocyanate and total RNA was purified 
with the RNeasy purification kit (Qiagen). We purified mRNA as needed 



using a poly(A) purification kit (Oligotex, Qiagen) 

integrity and relative contamination of mRNA with ribosomal RNA 

small p,eces (-50-100 mgeach). immediately placed into lO^TrnTof Tri 
jo reagent (CTko-BRI) ,„d homogeni J3, . £j 
Homogenizer (Fisher Scientific), starting at 5,000 r.p.m. and gradually 

zol/tumour homogenate as described in the Triml nm .^T . 
initial step «» remoWat Once totaS 



(Fig. 5 panel labels: tumour cell cluster; leukocyte clusters; stromal cell cluster; proliferation cluster.) 

Fig. 5 Histologic features of breast cancer biopsies can be recognized based on gene expression patterns. Enlargements of the regions of the cluster diagram in Fig. 4 showing gene clusters enriched for genes expressed in different cell types, as distinguished by comparison with the cultured cell lines. 
We combined mRNA from the following cells in equal quantities to 
make the reference pool: HL-60 (acute myeloid leukaemia) and K562 
(chronic myeloid leukaemia); NCI-H226 (non-small-cell-lung); COLO 
205 (colon); SNB-19 (central nervous system); LOX-IMVI (melanoma); 
OVCAR-3 and OVCAR-4 (ovarian); CAKI-1 (renal); PC-3 (prostate); and 
MCF7 and Hs578T (breast). The criteria for selection of the cell lines in 
the reference are described in detail in the accompanying manuscript 12. 

Doubling-time calculations. We calculated doubling times based on rou- 
tine NCI60 cell line compound screening data; they reflect the dou- 
bling times for cells inoculated into 96-well plates at the screening inocula- 
tion densities and grown in RPMI 1640 medium supplemented with 5% 
fetal bovine serum for 48 h. We measured cell populations using a sulforho- 
damine B optical density assay. The doubling time constant k 
was calculated using the equation $N/N_0 = e^{kt}$, where $N_0$ is optical density 
for control (untreated) cells at time zero, $N$ is optical density for control cells 
after 48-h incubation, and $t$ is 48 h. The same equation was then used with the 
derived k to calculate the doubling time t by setting $N/N_0 = 2$. For a given cell 
line, we obtained $N_0$ and $N$ values by averaging optical densities (N > 6,000) 
obtained for each cell line for a year's screening. Data and experimental details 
are available (http://dtp.nci.nih.gov). 
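
The two-step calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the optical density values are hypothetical and do not come from the screening data.

```python
import math


def growth_constant(od_start, od_end, hours=48.0):
    """Solve N/N0 = exp(k*t) for k, given optical densities at t = 0 and t = hours."""
    return math.log(od_end / od_start) / hours


def doubling_time(k):
    """Time t at which N/N0 = 2, that is t = ln(2) / k."""
    return math.log(2.0) / k


# Hypothetical optical densities (not from the paper): a fourfold rise over 48 h.
k = growth_constant(0.25, 1.0)
print(f"k = {k:.4f} per hour, doubling time = {doubling_time(k):.1f} h")
```

A fourfold increase over 48 h corresponds to two doublings, so the sketch reports a doubling time of 24 h.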



at http://rana.stanford.edu/software). Each spot was defined by manual 
positioning of a grid of circles over the array image. For each fluorescent 
image the average pixel intensity within each circle was determined, and a 
local background was computed for each spot equal to the median pixel 
intensity in a square of 40 pixels in width and height centred on the spot 
centre, excluding all pixels within any defined spots. Net signal was deter- 
mined by subtraction of this local background from the average intensity 
for each spot. Spots deemed unsuitable for accurate quantitation because 
of array artefacts were manually flagged and excluded from further analy- 
sis. Data files generated by ScanAlyze were entered into a custom database 
that maintains web-accessible files. Signal intensities between the two fluo- 
rescent images were normalized by applying a uniform scale factor to all 
intensities measured for the Cy5 channel. The normalization factor was 
chosen so that the mean log(Cy3/Cy5) for a subset of spots that achieved a 
minimum quality parameter (approximately 6,000 spots) was 0. This effec- 
tively defines the signal-intensity-weighted 'average' spot on each array to 
have a Cy3/Cy5 ratio of 1.0. 
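
The background subtraction and normalization steps can be illustrated with a short Python sketch. It is not the ScanAlyze implementation; the function names and the intensity values are hypothetical, and the quality filter that selects the subset of spots is assumed to have been applied already.

```python
import math
from statistics import mean, median


def net_signal(spot_pixels, background_pixels):
    """Average intensity inside the spot circle minus the median local background."""
    return mean(spot_pixels) - median(background_pixels)


def cy5_scale_factor(cy3, cy5):
    """Uniform factor s for the Cy5 channel such that mean(log(Cy3 / (s * Cy5))) is 0."""
    return math.exp(mean(math.log(a / b) for a, b in zip(cy3, cy5)))


def normalize_cy5(cy3, cy5):
    """Apply the uniform scale factor to every Cy5 intensity."""
    s = cy5_scale_factor(cy3, cy5)
    return [s * b for b in cy5]


# Hypothetical net intensities for three quality-filtered spots.
cy3 = [200.0, 400.0, 800.0]
cy5 = [100.0, 100.0, 100.0]
scaled = normalize_cy5(cy3, cy5)
print(mean(math.log(a / b) for a, b in zip(cy3, scaled)))  # close to 0
```

Because the factor is the exponential of the mean log ratio, dividing it out shifts every log ratio by the same amount and leaves the differences between spots unchanged.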



Preparation and hybridization of fluorescent labelled cDNA. For each 
comparative array hybridization, labelled cDNA was synthesized by reverse 
transcription from test cell mRNA in the presence of Cy5-dUTP, and from 
the reference mRNA with Cy3-dUTP, using the Superscript II reverse-tran- 
scription kit (Gibco-BRL). For each reverse transcription reaction, mRNA 
(2 µg) was mixed with an anchored oligo-dT (d-20T-d(AGC)) primer (4 
µg) in a total volume of 15 µl, heated to 70 °C for 10 min and cooled on ice. 
To this sample, we added an unlabelled nucleotide pool (0.6 µl; 25 mM 
each dATP, dCTP, dGTP, and 15 mM dTTP), either Cy3- or Cy5-conjugated 
dUTP (3 µl; 1 mM; Amersham), 5× first-strand buffer (6 µl; 250 mM Tris- 
HCl, pH 8.3, 375 mM KCl, 15 mM MgCl2), 0.1 M DTT (3 µl) and 2 µl of 
Superscript II reverse transcriptase (200 U/µl). After a 2-h incubation at 42 
°C, the RNA was degraded by adding 1 N NaOH (1.5 µl) and incubating at 
70 °C for 10 min. The mixture was neutralized by adding 1 N HCl (1.5 
µl), and the volume brought to 500 µl with TE (10 mM Tris, 1 mM EDTA). 
We added Cot1 human DNA (20 µg; Gibco-BRL), and purified the probe 
by centrifugation in a Centricon-30 micro-concentrator (Amicon). The 
two separate probes were combined, brought to a volume of 500 µl, and 
concentrated again to a volume of less than 7 µl. We added 10 µg/µl 
poly(A) RNA (1 µl; Sigma) and tRNA (10 µg/µl; Gibco-BRL) 
and adjusted the volume to 9.5 µl with distilled water. For final probe 
preparation, 20×SSC (2.1 µl; 1.5 M NaCl, 150 mM NaCitrate, pH 8.0) and 
10% SDS (0.35 µl) were added to a total final volume of 12 µl. The probes 
were denatured by heating for 2 min at 100 °C, incubated at 37 °C for 
20-30 min, and placed on the array under a 22 mm × 22 mm glass coverslip. 
We incubated slides overnight at 65 °C for 14-18 h in a custom slide cham- 
ber with humidity maintained by a small reservoir of 3×SSC. Arrays were 
washed by submersion and agitation for 2-5 min in 2×SSC with 0.1% SDS, 
followed by 1×SSC and then 0.1×SSC. The arrays were "spun dry" by cen- 
trifugation for 2 min in a slide-rack in a Beckman GS-6 tabletop centrifuge 
in Microplus carriers at 650 r.p.m. 

Array quantitation and data processing. Following hybridization, arrays 
were scanned using a laser-scanning microscope (ref. 17; 
http://cmgm.stanford.edu/pbrown). Separate images were acquired for Cy3 and Cy5. We 
carried out data reduction with the program ScanAlyze (M.B.E., available 



Cluster analysis. We extracted tables (rows of genes, columns of individual 
microarray hybridizations) of normalized fluorescence ratios from the data- 
base. Various selection criteria, discussed in relation to each data set, were 
applied to select subsets of genes from the 9,703 cDNA elements on the 
arrays. Before clustering and display, the logarithm of the measured fluores- 
cence ratios for each gene was centred by subtracting the arithmetic mean of 
all ratios measured for that gene. The centring makes all subsequent analyses 
independent of the amount of each gene's mRNA in the reference pool. 
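
The effect of centring can be seen in a minimal Python sketch. It assumes base-2 logarithms and assumes that the mean is taken over the log-transformed ratios; neither detail is stated in the text, and the function name and ratio values are hypothetical.

```python
import math


def centre_log_ratios(ratios):
    """Log-transform one gene's ratios across arrays and subtract their mean."""
    logs = [math.log2(r) for r in ratios]
    gene_mean = sum(logs) / len(logs)
    return [x - gene_mean for x in logs]


# Hypothetical ratios for one gene on three arrays.
gene = [1.0, 2.0, 4.0]
# The same gene if the reference pool held one third as much of its mRNA:
# every ratio is multiplied by 3, which adds a constant to every log ratio.
gene_less_reference = [3.0 * r for r in gene]

print(centre_log_ratios(gene))
print(centre_log_ratios(gene_less_reference))
```

Both calls give the same centred profile (up to floating-point rounding), because a change in the reference amount adds the same constant to every log ratio for that gene and the subtraction of the gene mean removes it.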

We applied a hierarchical clustering algorithm separately to the cell lines 
and genes using the Pearson correlation coefficient as the measure of simi- 
larity and average linkage clustering. The results of this process are 
two dendrograms (trees), one for the cell lines and one for the genes, in 
which very similar elements are connected by short branches, and longer 
branches join elements with diminishing degrees of similarity. For visual 
display the rows and columns in the initial data table were reordered to 
conform to the structures of the dendrograms obtained from the cluster 
analysis. Each cell in the cluster-ordered data table was replaced by a graded 
colour (pure red through black to pure green), representing the mean- 
adjusted ratio value in the cell. Gene labels in cluster diagrams are dis- 
played here only for genes that were represented in the microarray by 
sequence-verified cDNAs. A complete software implementation of this 
process is available (http://rana.stanford.edu/software), as well as all clus- 
tering results (http://genome-www.stanford.edu/nci60). 
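
As an illustration of the clustering step, the following Python sketch performs plain average-linkage agglomerative clustering with the Pearson correlation coefficient as the similarity measure. It is not the software referred to above and may differ from it in detail; the function names and the small data table are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_linkage(rows):
    """Repeatedly merge the two clusters with the highest mean pairwise correlation.

    Returns (tree, order): a nested tuple of row indices (the dendrogram) and
    the left-to-right leaf order used to reorder the rows for display.
    """
    n = len(rows)
    sim = [[pearson(rows[i], rows[j]) for j in range(n)] for i in range(n)]
    clusters = [(i, [i]) for i in range(n)]  # (subtree, leaf indices)
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                la, lb = clusters[a][1], clusters[b][1]
                s = sum(sim[i][j] for i in la for j in lb) / (len(la) * len(lb))
                if best is None or s > best[0]:
                    best = (s, a, b)
        _, a, b = best
        merged = ((clusters[a][0], clusters[b][0]), clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0]


# Hypothetical centred profiles: rows 0 and 1 rise together, rows 2 and 3 fall together.
table = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.5],
    [4.0, 3.0, 2.0, 1.0],
    [8.0, 6.0, 4.0, 1.0],
]
tree, order = average_linkage(table)
reordered = [table[i] for i in order]
print(tree, order)
```

Applying the same function to the transposed table clusters the columns (cell lines), which mirrors the separate clustering of genes and cell lines described in the text. The naive search over all cluster pairs is adequate for a small example but would be slow for thousands of genes.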